Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device

ABSTRACT

A three-dimensional data encoding method includes: combining first point cloud data and second point cloud data to generate third point cloud data; and encoding the third point cloud data to generate encoded data. The encoded data includes identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a U.S. continuation application of PCT International Patent Application Number PCT/JP2019/036844 filed on Sep. 19, 2019, claiming the benefit of priority of U.S. Provisional Patent Application No. 62/734,657 filed on Sep. 21, 2018, the entire contents of which are hereby incorporated by reference.

BACKGROUND

1. Technical Field

The present disclosure relates to a three-dimensional data encoding method, a three-dimensional data decoding method, a three-dimensional data encoding device, and a three-dimensional data decoding device.

2. Description of the Related Art

Devices or services utilizing three-dimensional data are expected to find widespread use in a wide range of fields, such as computer vision that enables autonomous operations of cars or robots, map information, monitoring, infrastructure inspection, and video distribution. Three-dimensional data is obtained through various means including a distance sensor such as a rangefinder, as well as a stereo camera and a combination of a plurality of monocular cameras.

Methods of representing three-dimensional data include a method known as a point cloud scheme that represents the shape of a three-dimensional structure by a point cloud in a three-dimensional space. In the point cloud scheme, the positions and colors of a point cloud are stored. While point cloud is expected to be a mainstream method of representing three-dimensional data, a massive amount of data of a point cloud necessitates compression of the amount of three-dimensional data by encoding for accumulation and transmission, as in the case of a two-dimensional moving picture (examples include Moving Picture Experts Group-4 Advanced Video Coding (MPEG-4 AVC) and High Efficiency Video Coding (HEVC) standardized by MPEG).

Meanwhile, point cloud compression is partially supported by, for example, an open-source library (Point Cloud Library) for point cloud-related processing.

Furthermore, a technique for searching for and displaying a facility located in the surroundings of the vehicle by using three-dimensional map data is known (for example, see International Publication WO2014/020663).

SUMMARY

There has been a demand for improving coding efficiency in encoding and decoding three-dimensional data.

The present disclosure provides a three-dimensional data encoding method, a three-dimensional data decoding method, a three-dimensional data encoding device, or a three-dimensional data decoding device that is capable of improving coding efficiency.

A three-dimensional data encoding method according to an aspect of the present disclosure includes: combining first point cloud data and second point cloud data to generate third point cloud data; and encoding the third point cloud data to generate encoded data, wherein the encoded data includes identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data.

A three-dimensional data decoding method according to an aspect of the present disclosure includes: decoding encoded data to obtain third point cloud data and identification information, the third point cloud data being generated by combining first point cloud data and second point cloud data, the identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data; and separating the first point cloud data and the second point cloud data from the third point cloud data, using the identification information.

The present disclosure can provide a three-dimensional data encoding method, a three-dimensional data decoding method, a three-dimensional data encoding device, or a three-dimensional data decoding device that is capable of improving coding efficiency.

BRIEF DESCRIPTION OF DRAWINGS

These and other objects, advantages and features of the disclosure will become apparent from the following description thereof taken in conjunction with the accompanying drawings that illustrate a specific embodiment of the present disclosure.

FIG. 1 is a diagram illustrating a configuration of a three-dimensional data encoding and decoding system according to Embodiment 1;

FIG. 2 is a diagram illustrating a structure example of point cloud data according to Embodiment 1;

FIG. 3 is a diagram illustrating a structure example of a data file indicating the point cloud data according to Embodiment 1;

FIG. 4 is a diagram illustrating types of the point cloud data according to Embodiment 1;

FIG. 5 is a diagram illustrating a structure of a first encoder according to Embodiment 1;

FIG. 6 is a block diagram illustrating the first encoder according to Embodiment 1;

FIG. 7 is a diagram illustrating a structure of a first decoder according to Embodiment 1;

FIG. 8 is a block diagram illustrating the first decoder according to Embodiment 1;

FIG. 9 is a diagram illustrating a structure of a second encoder according to Embodiment 1;

FIG. 10 is a block diagram illustrating the second encoder according to Embodiment 1;

FIG. 11 is a diagram illustrating a structure of a second decoder according to Embodiment 1;

FIG. 12 is a block diagram illustrating the second decoder according to Embodiment 1;

FIG. 13 is a diagram illustrating a protocol stack related to PCC encoded data according to Embodiment 1;

FIG. 14 is a diagram illustrating a basic structure of ISOBMFF according to Embodiment 2;

FIG. 15 is a diagram illustrating a protocol stack according to Embodiment 2;

FIG. 16 is a diagram illustrating structures of an encoder and a multiplexer according to Embodiment 3;

FIG. 17 is a diagram illustrating a structure example of encoded data according to Embodiment 3;

FIG. 18 is a diagram illustrating a structure example of encoded data and a NAL unit according to Embodiment 3;

FIG. 19 is a diagram illustrating a semantics example of pcc_nal_unit_type according to Embodiment 3;

FIG. 20 is a diagram illustrating an example of a transmitting order of NAL units according to Embodiment 3;

FIG. 21 is a block diagram showing a first encoder according to Embodiment 4;

FIG. 22 is a block diagram showing a first decoder according to Embodiment 4;

FIG. 23 is a block diagram showing a divider according to Embodiment 4;

FIG. 24 is a diagram illustrating an example of dividing slices and tiles according to Embodiment 4;

FIG. 25 is a diagram illustrating dividing pattern examples of slices and tiles according to Embodiment 4;

FIG. 26 is a diagram illustrating an example of dependency relationships according to Embodiment 4;

FIG. 27 is a diagram illustrating an example of decoding order of data according to Embodiment 4;

FIG. 28 is a flowchart of an encoding process according to Embodiment 4;

FIG. 29 is a block diagram of a combiner according to Embodiment 4;

FIG. 30 is a diagram illustrating a structural example of encoded data and NAL units according to Embodiment 4;

FIG. 31 is a flowchart of an encoding process according to Embodiment 4;

FIG. 32 is a flowchart of a decoding process according to Embodiment 4;

FIG. 33 is a flowchart of an encoding process according to Embodiment 4;

FIG. 34 is a flowchart of a decoding process according to Embodiment 4;

FIG. 35 is a diagram showing a concept of generation of a tree structure and an occupancy code from point cloud data of a plurality of frames according to Embodiment 5;

FIG. 36 is a diagram showing an example of frame combining according to Embodiment 5;

FIG. 37 is a diagram showing an example of combining of a plurality of frames according to Embodiment 5;

FIG. 38 is a flowchart of a three-dimensional data encoding process according to Embodiment 5;

FIG. 39 is a flowchart of an encoding process according to Embodiment 5;

FIG. 40 is a flowchart of a three-dimensional data decoding process according to Embodiment 5;

FIG. 41 is a flowchart of a decoding and dividing process according to Embodiment 5;

FIG. 42 is a block diagram showing an encoder according to Embodiment 5;

FIG. 43 is a block diagram showing a divider according to Embodiment 5;

FIG. 44 is a block diagram showing a geometry information encoder according to Embodiment 5;

FIG. 45 is a block diagram showing an attribute information encoder according to Embodiment 5;

FIG. 46 is a flowchart of a process of encoding point cloud data according to Embodiment 5;

FIG. 47 is a flowchart of an encoding process according to Embodiment 5;

FIG. 48 is a block diagram showing a decoder according to Embodiment 5;

FIG. 49 is a block diagram showing a geometry information decoder according to Embodiment 5;

FIG. 50 is a block diagram showing an attribute information decoder according to Embodiment 5;

FIG. 51 is a block diagram showing a combiner according to Embodiment 5;

FIG. 52 is a flowchart of a process of decoding point cloud data according to Embodiment 5;

FIG. 53 is a flowchart of a decoding process according to Embodiment 5;

FIG. 54 is a diagram showing an example of pattern of frame combining according to Embodiment 5;

FIG. 55 is a diagram showing a configuration example of PCC frames according to Embodiment 5;

FIG. 56 is a diagram showing a configuration of encoded geometry information according to Embodiment 5;

FIG. 57 is a diagram showing a syntax example of a header of encoded geometry information according to Embodiment 5;

FIG. 58 is a diagram showing a syntax example of a payload of encoded geometry information according to Embodiment 5;

FIG. 59 is a diagram showing an example of leaf node information according to Embodiment 5;

FIG. 60 is a diagram showing an example of the leaf node information according to Embodiment 5;

FIG. 61 is a diagram showing an example of bit map information according to Embodiment 5;

FIG. 62 is a diagram showing a configuration of encoded attribute information according to Embodiment 5;

FIG. 63 is a diagram showing a syntax example of a header of encoded attribute information according to Embodiment 5;

FIG. 64 is a diagram showing a syntax example of a payload of encoded attribute information according to Embodiment 5;

FIG. 65 is a diagram showing a configuration of encoded data according to Embodiment 5;

FIG. 66 is a diagram showing an order of transmission and a data reference relationship according to Embodiment 5;

FIG. 67 is a diagram showing an order of transmission and a data reference relationship according to Embodiment 5;

FIG. 68 is a diagram showing an order of transmission and a data reference relationship according to Embodiment 5;

FIG. 69 is a diagram showing an example in which part of frames is decoded according to Embodiment 5;

FIG. 70 is a diagram showing an order of transmission and a data reference relationship according to Embodiment 5;

FIG. 71 is a diagram showing an order of transmission and a data reference relationship according to Embodiment 5;

FIG. 72 is a diagram showing an order of transmission and a data reference relationship according to Embodiment 5;

FIG. 73 is a diagram showing an order of transmission and a data reference relationship according to Embodiment 5;

FIG. 74 is a flowchart of an encoding process according to Embodiment 5; and

FIG. 75 is a flowchart of a decoding process according to Embodiment 5.

DETAILED DESCRIPTION OF THE EMBODIMENTS

A three-dimensional data encoding method according to an aspect of the present disclosure includes: combining first point cloud data and second point cloud data to generate third point cloud data; and encoding the third point cloud data to generate encoded data, wherein the encoded data includes identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data.

Accordingly, the three-dimensional data encoding method can improve coding efficiency by encoding pieces of point cloud data all at once.
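The combining step described above can be illustrated with a minimal sketch. The names TaggedPoint, source_id, and combine are hypothetical and do not appear in the disclosure, and the use of 0 and 1 as identification values is an assumption made only for illustration; the actual encoding of the combined data is not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int, int]  # (x, y, z) coordinates of one three-dimensional point


@dataclass(frozen=True)
class TaggedPoint:
    """One point of the third point cloud data plus its identification information."""
    position: Point
    source_id: int  # assumption: 0 = first point cloud data, 1 = second point cloud data


def combine(first: List[Point], second: List[Point]) -> List[TaggedPoint]:
    """Combine two point clouds, recording which cloud each point came from."""
    combined = [TaggedPoint(p, 0) for p in first]
    combined += [TaggedPoint(p, 1) for p in second]
    return combined


third = combine([(0, 0, 0)], [(0, 0, 0), (1, 2, 3)])
```

Note that two points at the same position remain distinguishable in the combined data because their identification values differ.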

For example, the first point cloud data and the second point cloud data may be point cloud data having different times.

For example, the first point cloud data and the second point cloud data may be point cloud data of a same object and having different times.

For example, the encoded data may include geometry information and attribute information of each of the three-dimensional points included in the third point cloud data, and the identification information may be included in the attribute information.

For example, the encoded data may include geometry information in which a position of each of the three-dimensional points included in the third point cloud data is expressed using an N-ary tree, N being an integer greater than or equal to 2.

A three-dimensional data decoding method according to an aspect of the present disclosure includes: decoding encoded data to obtain third point cloud data and identification information, the third point cloud data being generated by combining first point cloud data and second point cloud data, the identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data; and separating the first point cloud data and the second point cloud data from the third point cloud data, using the identification information.

Accordingly, the three-dimensional data decoding method can decode encoded data the coding efficiency of which has been improved by having pieces of point cloud data encoded all at once.
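The separating step on the decoding side can be sketched as follows, assuming the decoder has already produced a list of positions and a parallel list of identification values. The function name separate and the list-based representation are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int, int]  # (x, y, z) coordinates of one three-dimensional point


def separate(points: List[Point], source_ids: List[int]) -> Dict[int, List[Point]]:
    """Split decoded third point cloud data back into its source point clouds.

    points and source_ids are parallel lists: source_ids[i] is the
    identification information of points[i].
    """
    if len(points) != len(source_ids):
        raise ValueError("each point needs exactly one identification value")
    clouds: Dict[int, List[Point]] = {}
    for point, source_id in zip(points, source_ids):
        clouds.setdefault(source_id, []).append(point)
    return clouds


clouds = separate([(0, 0, 0), (1, 1, 1), (0, 0, 0)], [0, 1, 1])
first_cloud, second_cloud = clouds[0], clouds[1]
```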

For example, the first point cloud data and the second point cloud data may be point cloud data having different times.

For example, the first point cloud data and the second point cloud data may be point cloud data of a same subject and having different times.

For example, the encoded data may include geometry information and attribute information of each of the three-dimensional points included in the third point cloud data, and the identification information may be included in the attribute information.

For example, the encoded data may include geometry information in which a position of each of the three-dimensional points included in the third point cloud data is expressed using an N-ary tree, N being an integer greater than or equal to 2.

A three-dimensional data encoding device according to an aspect of the present disclosure includes: a processor; and memory, wherein, using the memory, the processor: combines first point cloud data and second point cloud data to generate third point cloud data; and encodes the third point cloud data to generate encoded data, wherein the encoded data includes identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data.

Accordingly, the three-dimensional data encoding device can improve coding efficiency by encoding pieces of point cloud data all at once.

A three-dimensional data decoding device according to an aspect of the present disclosure includes: a processor; and memory, wherein, using the memory, the processor: decodes encoded data to obtain third point cloud data and identification information, the third point cloud data being generated by combining first point cloud data and second point cloud data, the identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data; and separates the first point cloud data and the second point cloud data from the third point cloud data, using the identification information.

Accordingly, the three-dimensional data decoding device can decode encoded data the coding efficiency of which has been improved by having pieces of point cloud data encoded all at once.

Note that these general or specific aspects may be implemented as a system, a method, an integrated circuit, a computer program, or a computer-readable recording medium such as a CD-ROM, or may be implemented as any combination of a system, a method, an integrated circuit, a computer program, and a recording medium.

The following describes embodiments with reference to the drawings. Note that the following embodiments show exemplary embodiments of the present disclosure. The numerical values, shapes, materials, structural components, the arrangement and connection of the structural components, steps, the processing order of the steps, etc. shown in the following embodiments are mere examples, and thus are not intended to limit the present disclosure. Of the structural components described in the following embodiments, structural components not recited in any one of the independent claims that indicate the broadest concepts will be described as optional structural components.

Embodiment 1

When encoded data of a point cloud is used in an actual device or service, it is desirable to transmit and receive the information required by the application in order to reduce the network bandwidth. However, conventional encoding structures for three-dimensional data have no such function, and there is also no encoding method for such a function.

Embodiment 1 described below relates to a three-dimensional data encoding method and a three-dimensional data encoding device for encoded data of a three-dimensional point cloud that provides a function of transmitting and receiving required information for an application, a three-dimensional data decoding method and a three-dimensional data decoding device for decoding the encoded data, a three-dimensional data multiplexing method for multiplexing the encoded data, and a three-dimensional data transmission method for transmitting the encoded data.

In particular, at present, a first encoding method and a second encoding method are under investigation as encoding methods (encoding schemes) for point cloud data. However, there is no method defined for storing the configuration of encoded data and the encoded data in a system format. Thus, there is a problem that an encoder cannot perform an MUX process (multiplexing), transmission, or accumulation of data.

In addition, there is no method for supporting a format that involves two codecs, the first encoding method and the second encoding method, such as point cloud compression (PCC).

With regard to this embodiment, a configuration of PCC-encoded data that involves two codecs, a first encoding method and a second encoding method, and a method of storing the encoded data in a system format will be described.

A configuration of a three-dimensional data (point cloud data) encoding and decoding system according to this embodiment will be first described. FIG. 1 is a diagram showing an example of a configuration of the three-dimensional data encoding and decoding system according to this embodiment. As shown in FIG. 1, the three-dimensional data encoding and decoding system includes three-dimensional data encoding system 4601, three-dimensional data decoding system 4602, sensor terminal 4603, and external connector 4604.

Three-dimensional data encoding system 4601 generates encoded data or multiplexed data by encoding point cloud data, which is three-dimensional data. Three-dimensional data encoding system 4601 may be a three-dimensional data encoding device implemented by a single device or a system implemented by a plurality of devices. The three-dimensional data encoding device may include a part of a plurality of processors included in three-dimensional data encoding system 4601.

Three-dimensional data encoding system 4601 includes point cloud data generation system 4611, presenter 4612, encoder 4613, multiplexer 4614, input/output unit 4615, and controller 4616. Point cloud data generation system 4611 includes sensor information obtainer 4617, and point cloud data generator 4618.

Sensor information obtainer 4617 obtains sensor information from sensor terminal 4603, and outputs the sensor information to point cloud data generator 4618. Point cloud data generator 4618 generates point cloud data from the sensor information, and outputs the point cloud data to encoder 4613.

Presenter 4612 presents the sensor information or point cloud data to a user. For example, presenter 4612 displays information or an image based on the sensor information or point cloud data.

Encoder 4613 encodes (compresses) the point cloud data, and outputs the resulting encoded data, control information (signaling information) obtained in the course of the encoding, and other additional information to multiplexer 4614. The additional information includes the sensor information, for example.

Multiplexer 4614 generates multiplexed data by multiplexing the encoded data, the control information, and the additional information input thereto from encoder 4613. A format of the multiplexed data is a file format for accumulation or a packet format for transmission, for example.

Input/output unit 4615 (a communication unit or interface, for example) outputs the multiplexed data to the outside. Alternatively, the multiplexed data may be accumulated in an accumulator, such as an internal memory. Controller 4616 (or an application executor) controls each processor. That is, controller 4616 controls the encoding, the multiplexing, or other processing.

Note that the sensor information may be input to encoder 4613 or multiplexer 4614. Alternatively, input/output unit 4615 may output the point cloud data or encoded data to the outside as it is.

A transmission signal (multiplexed data) output from three-dimensional data encoding system 4601 is input to three-dimensional data decoding system 4602 via external connector 4604.

Three-dimensional data decoding system 4602 generates point cloud data, which is three-dimensional data, by decoding the encoded data or multiplexed data. Note that three-dimensional data decoding system 4602 may be a three-dimensional data decoding device implemented by a single device or a system implemented by a plurality of devices. The three-dimensional data decoding device may include a part of a plurality of processors included in three-dimensional data decoding system 4602.

Three-dimensional data decoding system 4602 includes sensor information obtainer 4621, input/output unit 4622, demultiplexer 4623, decoder 4624, presenter 4625, user interface 4626, and controller 4627.

Sensor information obtainer 4621 obtains sensor information from sensor terminal 4603.

Input/output unit 4622 obtains the transmission signal, decodes the transmission signal into the multiplexed data (file format or packet), and outputs the multiplexed data to demultiplexer 4623.

Demultiplexer 4623 obtains the encoded data, the control information, and the additional information from the multiplexed data, and outputs the encoded data, the control information, and the additional information to decoder 4624.
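The relationship between multiplexer 4614 and demultiplexer 4623 can be illustrated with a toy container. The length-prefixed byte layout, the field names, and the functions multiplex and demultiplex below are assumptions made purely for illustration; the disclosure states only that the multiplexed data takes a file format or a packet format, and this sketch is not any of those formats.

```python
import struct
from typing import Dict

# Hypothetical fixed field order shared by the multiplexer and the demultiplexer.
FIELDS = ("encoded_data", "control_information", "additional_information")


def multiplex(parts: Dict[str, bytes]) -> bytes:
    """Concatenate the three inputs, each preceded by a 4-byte big-endian length."""
    out = bytearray()
    for name in FIELDS:
        payload = parts[name]
        out += struct.pack(">I", len(payload))
        out += payload
    return bytes(out)


def demultiplex(blob: bytes) -> Dict[str, bytes]:
    """Recover the three inputs from data produced by multiplex()."""
    parts: Dict[str, bytes] = {}
    offset = 0
    for name in FIELDS:
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        parts[name] = blob[offset:offset + length]
        offset += length
    return parts


original = {
    "encoded_data": b"\x81\x01\x80",
    "control_information": b"ctrl",
    "additional_information": b"",
}
recovered = demultiplex(multiplex(original))
```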

Decoder 4624 reconstructs the point cloud data by decoding the encoded data.

Presenter 4625 presents the point cloud data to a user. For example, presenter 4625 displays information or an image based on the point cloud data. User interface 4626 obtains an indication based on a manipulation by the user. Controller 4627 (or an application executor) controls each processor. That is, controller 4627 controls the demultiplexing, the decoding, the presentation, or other processing.

Note that input/output unit 4622 may obtain the point cloud data or encoded data as it is from the outside. Presenter 4625 may obtain additional information, such as sensor information, and present information based on the additional information. Presenter 4625 may perform a presentation based on an indication from a user obtained on user interface 4626.

Sensor terminal 4603 generates sensor information, which is information obtained by a sensor. Sensor terminal 4603 is a terminal provided with a sensor or a camera. For example, sensor terminal 4603 is a mobile body, such as an automobile, a flying object, such as an aircraft, a mobile terminal, or a camera.

Sensor information that can be generated by sensor terminal 4603 includes (1) the distance between sensor terminal 4603 and an object or the reflectance of the object obtained by LIDAR, a millimeter wave radar, or an infrared sensor or (2) the distance between a camera and an object or the reflectance of the object obtained by a plurality of monocular camera images or a stereo-camera image, for example. The sensor information may include the posture, orientation, gyro (angular velocity), position (GPS information or altitude), velocity, or acceleration of the sensor, for example. The sensor information may include air temperature, air pressure, air humidity, or magnetism, for example.

External connector 4604 is implemented by an integrated circuit (LSI or IC), an external accumulator, communication with a cloud server via the Internet, or broadcasting, for example.

Next, point cloud data will be described. FIG. 2 is a diagram showing a configuration of point cloud data. FIG. 3 is a diagram showing a configuration example of a data file describing information of the point cloud data.

Point cloud data includes data on a plurality of points. Data on each point includes geometry information (three-dimensional coordinates) and attribute information associated with the geometry information. A set of a plurality of such points is referred to as a point cloud. For example, a point cloud indicates a three-dimensional shape of an object.

Geometry information (position), such as three-dimensional coordinates, may be referred to as geometry. Data on each point may include attribute information (attribute) on a plurality of types of attributes. A type of attribute is color or reflectance, for example.

One piece of attribute information may be associated with one piece of geometry information, or attribute information on a plurality of different types of attributes may be associated with one piece of geometry information. Alternatively, a plurality of pieces of attribute information on the same type of attribute may be associated with one piece of geometry information.

The configuration example of a data file shown in FIG. 3 is an example in which geometry information and attribute information are associated with each other in a one-to-one relationship, and geometry information and attribute information on N points forming point cloud data are shown.

The geometry information is information on three axes, specifically, an x-axis, a y-axis, and a z-axis, for example. The attribute information is RGB color information, for example. A representative data file is a PLY file, for example.
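As an illustration of a data file in which geometry information and attribute information are associated one-to-one, the following sketch serializes N points with x, y, z coordinates and RGB color into the ASCII variant of the PLY format. The function name to_ply and the in-memory representation of a point are hypothetical; the exact layout shown in FIG. 3 may differ.

```python
from typing import List, Tuple

Position = Tuple[float, float, float]  # geometry information (x, y, z)
Color = Tuple[int, int, int]           # attribute information (R, G, B), each 0-255


def to_ply(points: List[Tuple[Position, Color]]) -> str:
    """Serialize points, each with one position and one color, as ASCII PLY text."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    # One body line per point: geometry first, then the associated attribute.
    body = [f"{x} {y} {z} {r} {g} {b}" for (x, y, z), (r, g, b) in points]
    return "\n".join(header + body) + "\n"


ply_text = to_ply([((0.0, 1.0, 2.0), (255, 0, 0)), ((3.5, 4.0, 5.0), (0, 128, 255))])
```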

Next, types of point cloud data will be described. FIG. 4 is a diagram showing types of point cloud data. As shown in FIG. 4, point cloud data includes a static object and a dynamic object.

The static object is three-dimensional point cloud data at an arbitrary time (a time point). The dynamic object is three-dimensional point cloud data that varies with time. In the following, three-dimensional point cloud data associated with a time point will be referred to as a PCC frame or a frame.

The object may be a point cloud whose range is limited to some extent, such as ordinary video data, or may be a large point cloud whose range is not limited, such as map information.

There are point cloud data having varying densities. There may be sparse point cloud data and dense point cloud data.

In the following, each processor will be described in detail. Sensor information is obtained by various means, including a distance sensor such as LIDAR or a range finder, a stereo camera, or a combination of a plurality of monocular cameras. Point cloud data generator 4618 generates point cloud data based on the sensor information obtained by sensor information obtainer 4617. Point cloud data generator 4618 generates geometry information as point cloud data, and adds attribute information associated with the geometry information to the geometry information.

When generating geometry information or adding attribute information, point cloud data generator 4618 may process the point cloud data. For example, point cloud data generator 4618 may reduce the data amount by omitting a point cloud whose position coincides with the position of another point cloud. Point cloud data generator 4618 may also convert the geometry information (such as shifting, rotating or normalizing the position) or render the attribute information.
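Two of the processing steps mentioned above, omitting points whose positions coincide and shifting the geometry information, can be sketched as follows. The function names and the policy of keeping the first point at each position are assumptions for illustration; the disclosure does not specify which of the coinciding points is kept or how the shift is chosen.

```python
from typing import List, Tuple

Position = Tuple[int, int, int]
Color = Tuple[int, int, int]
PointRecord = Tuple[Position, Color]


def remove_coinciding_points(points: List[PointRecord]) -> List[PointRecord]:
    """Keep only the first point found at each position (assumed policy)."""
    seen = set()
    kept: List[PointRecord] = []
    for position, color in points:
        if position not in seen:
            seen.add(position)
            kept.append((position, color))
    return kept


def shift_to_origin(points: List[PointRecord]) -> List[PointRecord]:
    """Shift all positions so that the minimum coordinate on each axis becomes 0."""
    if not points:
        return []
    min_x = min(p[0][0] for p in points)
    min_y = min(p[0][1] for p in points)
    min_z = min(p[0][2] for p in points)
    return [((x - min_x, y - min_y, z - min_z), c) for (x, y, z), c in points]


raw = [((5, 5, 5), (255, 0, 0)), ((5, 5, 5), (0, 255, 0)), ((7, 6, 5), (0, 0, 255))]
processed = shift_to_origin(remove_coinciding_points(raw))
```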

Note that, although FIG. 1 shows point cloud data generation system 4611 as being included in three-dimensional data encoding system 4601, point cloud data generation system 4611 may be independently provided outside three-dimensional data encoding system 4601.

Encoder 4613 generates encoded data by encoding point cloud data according to an encoding method previously defined. In general, there are the two types of encoding methods described below. One is an encoding method using geometry information, which will be referred to as a first encoding method, hereinafter. The other is an encoding method using a video codec, which will be referred to as a second encoding method, hereinafter.

Decoder 4624 decodes the encoded data into the point cloud data using the encoding method previously defined.

Multiplexer 4614 generates multiplexed data by multiplexing the encoded data in an existing multiplexing method. The generated multiplexed data is transmitted or accumulated. Multiplexer 4614 multiplexes not only the PCC-encoded data but also another medium, such as video, audio, subtitles, an application, or a file, or reference time information. Multiplexer 4614 may further multiplex attribute information associated with sensor information or point cloud data.

Multiplexing schemes or file formats include ISOBMFF, MPEG-DASH, which is a transmission scheme based on ISOBMFF, MMT, MPEG-2 TS Systems, or RMP, for example.

Demultiplexer 4623 extracts PCC-encoded data, other media, time information and the like from the multiplexed data.

Input/output unit 4615 transmits the multiplexed data in a method suitable for the transmission medium or accumulation medium, such as broadcasting or communication. Input/output unit 4615 may communicate with another device over the Internet or communicate with an accumulator, such as a cloud server.

As a communication protocol, HTTP, FTP, TCP, UDP or the like is used. The pull communication scheme or the push communication scheme can be used.

A wired transmission or a wireless transmission can be used. For the wired transmission, Ethernet (registered trademark), USB, RS-232C, HDMI (registered trademark), or a coaxial cable is used, for example. For the wireless transmission, wireless LAN, Wi-Fi (registered trademark), Bluetooth (registered trademark), or a millimeter wave is used, for example.

As a broadcasting scheme, DVB-T2, DVB-S2, DVB-C2, ATSC3.0, or ISDB-S3 is used, for example.

FIG. 5 is a diagram showing a configuration of first encoder 4630, whichis an example of encoder 4613 that performs encoding in the firstencoding method. FIG. 6 is a block diagram showing first encoder 4630.First encoder 4630 generates encoded data (encoded stream) by encodingpoint cloud data in the first encoding method. First encoder 4630includes geometry information encoder 4631, attribute informationencoder 4632, additional information encoder 4633, and multiplexer 4634.

First encoder 4630 is characterized by performing encoding by keeping athree-dimensional structure in mind. First encoder 4630 is furthercharacterized in that attribute information encoder 4632 performsencoding using information obtained from geometry information encoder4631. The first encoding method is referred to also as geometry-basedPCC (GPCC).

Point cloud data is PCC point cloud data like a PLY file or PCC point cloud data generated from sensor information, and includes geometry information (position), attribute information (attribute), and other additional information (metadata). The geometry information is input to geometry information encoder 4631, the attribute information is input to attribute information encoder 4632, and the additional information is input to additional information encoder 4633.

Geometry information encoder 4631 generates encoded geometry information (compressed geometry), which is encoded data, by encoding geometry information. For example, geometry information encoder 4631 encodes geometry information using an N-ary tree structure, such as an octree. Specifically, in the case of an octree, a current space is divided into eight nodes (subspaces), and 8-bit information (occupancy code) that indicates whether each node includes a point cloud or not is generated. A node including a point cloud is further divided into eight nodes, and 8-bit information that indicates whether each of the eight nodes includes a point cloud or not is generated. This process is repeated until a predetermined level is reached or the number of points included in each node becomes equal to or less than a threshold.
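The octree division described above can be illustrated by the following minimal sketch. It is an illustration under stated assumptions, not the encoding method itself: the function name, the cubic bounding box given by `origin` and `size`, the bit order of the occupancy code, and the breadth-first output order are all hypothetical, and only the "predetermined level" stop condition is shown (the threshold on the number of points is omitted).

```python
def octree_occupancy_codes(points, origin, size, max_depth):
    """Return one 8-bit occupancy code per divided node, in breadth-first order."""
    codes = []
    nodes = [(origin, size, list(points))]
    for _ in range(max_depth):
        next_nodes = []
        for (ox, oy, oz), node_size, node_points in nodes:
            half = node_size / 2
            children = [[] for _ in range(8)]
            for x, y, z in node_points:
                # Child index: one bit per axis (hypothetical bit order x, y, z).
                index = ((int(x >= ox + half) << 2)
                         | (int(y >= oy + half) << 1)
                         | int(z >= oz + half))
                children[index].append((x, y, z))
            code = 0
            for index, child_points in enumerate(children):
                if not child_points:
                    continue
                code |= 1 << index  # mark this subspace as occupied
                child_origin = (ox + half * ((index >> 2) & 1),
                                oy + half * ((index >> 1) & 1),
                                oz + half * (index & 1))
                # Only occupied nodes are divided further.
                next_nodes.append((child_origin, half, child_points))
            codes.append(code)
        nodes = next_nodes
    return codes
```

For two points at opposite corners of a 4x4x4 space, the first code has two bits set (one per occupied subspace), and each occupied subspace then produces its own code at the next level.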

Attribute information encoder 4632 generates encoded attribute information (compressed attribute), which is encoded data, by encoding attribute information using configuration information generated by geometry information encoder 4631. For example, attribute information encoder 4632 determines a reference point (reference node) that is to be referred to in encoding a current point (current node) to be processed based on the octree structure generated by geometry information encoder 4631. For example, attribute information encoder 4632 refers to a node whose parent node in the octree is the same as the parent node of the current node, of peripheral nodes or neighboring nodes. Note that the method of determining a reference relationship is not limited to this method.

The process of encoding attribute information may include at least one of a quantization process, a prediction process, and an arithmetic encoding process. In this case, “refer to” means using a reference node for calculating a predicted value of attribute information or using a state of a reference node (occupancy information that indicates whether a reference node includes a point cloud or not, for example) for determining a parameter of encoding. For example, the parameter of encoding is a quantization parameter in the quantization process or a context or the like in the arithmetic encoding.
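As an illustration of how a reference node can be used to calculate a predicted value, the following minimal sketch predicts an attribute value as the mean of the attribute values of the reference nodes and quantizes the prediction residual. The function names, the mean predictor, and the uniform quantization step are assumptions made for illustration; the arithmetic encoding process is omitted.

```python
def encode_attribute(value, reference_values, step):
    """Predict from reference nodes, then quantize the prediction residual."""
    predicted = sum(reference_values) / len(reference_values)
    return round((value - predicted) / step)


def decode_attribute(quantized_residual, reference_values, step):
    """Rebuild the attribute value from the same prediction (inverse quantization)."""
    predicted = sum(reference_values) / len(reference_values)
    return predicted + quantized_residual * step
```

Because the decoder derives the same predicted value from the same reference nodes, only the quantized residual needs to be carried in the encoded attribute information in this sketch.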

Additional information encoder 4633 generates encoded additional information (compressed metadata), which is encoded data, by encoding compressible data of additional information.

Multiplexer 4634 generates an encoded stream (compressed stream), which is encoded data, by multiplexing encoded geometry information, encoded attribute information, encoded additional information, and other additional information. The generated encoded stream is output to a processor in a system layer (not shown).

Next, first decoder 4640, which is an example of decoder 4624 that performs decoding in the first encoding method, will be described. FIG. 7 is a diagram showing a configuration of first decoder 4640. FIG. 8 is a block diagram showing first decoder 4640. First decoder 4640 generates point cloud data by decoding encoded data (encoded stream) that has been encoded in the first encoding method. First decoder 4640 includes demultiplexer 4641, geometry information decoder 4642, attribute information decoder 4643, and additional information decoder 4644.

An encoded stream (compressed stream), which is encoded data, is input to first decoder 4640 from a processor in a system layer (not shown).

Demultiplexer 4641 separates encoded geometry information (compressed geometry), encoded attribute information (compressed attribute), encoded additional information (compressed metadata), and other additional information from the encoded data.

Geometry information decoder 4642 generates geometry information by decoding the encoded geometry information. For example, geometry information decoder 4642 restores the geometry information on a point cloud represented by three-dimensional coordinates from encoded geometry information represented by an N-ary tree structure, such as an octree.

Attribute information decoder 4643 decodes the encoded attribute information based on configuration information generated by geometry information decoder 4642. For example, attribute information decoder 4643 determines a reference point (reference node) that is to be referred to in decoding a current point (current node) to be processed based on the octree structure generated by geometry information decoder 4642. For example, attribute information decoder 4643 refers to a node whose parent node in the octree is the same as the parent node of the current node, of peripheral nodes or neighboring nodes. Note that the method of determining a reference relationship is not limited to this method.

The process of decoding attribute information may include at least one of an inverse quantization process, a prediction process, and an arithmetic decoding process. In this case, “refer to” means using a reference node for calculating a predicted value of attribute information or using a state of a reference node (occupancy information that indicates whether a reference node includes a point cloud or not, for example) for determining a parameter of decoding. For example, the parameter of decoding is a quantization parameter in the inverse quantization process or a context or the like in the arithmetic decoding.

Additional information decoder 4644 generates additional information by decoding the encoded additional information. First decoder 4640 uses, in the decoding, the additional information required for the decoding process for the geometry information and the attribute information, and outputs additional information required for an application to the outside.

Next, second encoder 4650, which is an example of encoder 4613 that performs encoding in the second encoding method, will be described. FIG. 9 is a diagram showing a configuration of second encoder 4650. FIG. 10 is a block diagram showing second encoder 4650.

Second encoder 4650 generates encoded data (encoded stream) by encoding point cloud data in the second encoding method. Second encoder 4650 includes additional information generator 4651, geometry image generator 4652, attribute image generator 4653, video encoder 4654, additional information encoder 4655, and multiplexer 4656.

Second encoder 4650 is characterized by generating a geometry image and an attribute image by projecting a three-dimensional structure onto a two-dimensional image, and encoding the generated geometry image and attribute image in an existing video encoding scheme. The second encoding method is referred to also as video-based PCC (VPCC).

Point cloud data is PCC point cloud data like a PLY file or PCC point cloud data generated from sensor information, and includes geometry information (position), attribute information (attribute), and other additional information (metadata).

Additional information generator 4651 generates map information on a plurality of two-dimensional images by projecting a three-dimensional structure onto a two-dimensional image.

Geometry image generator 4652 generates a geometry image based on the geometry information and the map information generated by additional information generator 4651. The geometry image is a distance image in which distance (depth) is indicated as a pixel value, for example. The distance image may be an image of a plurality of point clouds viewed from one point of view (an image of a plurality of point clouds projected onto one two-dimensional plane), a plurality of images of a plurality of point clouds viewed from a plurality of points of view, or a single image integrating the plurality of images.
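The simplest case mentioned above, a distance image obtained by projecting points onto one two-dimensional plane, can be sketched as follows. This is a simplified illustration only: the orthographic projection along the z axis, integer pixel coordinates, the rule of keeping the nearest point per pixel, and the use of `None` for empty pixels are assumptions, and the map information is not modeled.

```python
def depth_image(points, width, height):
    """Project points along the z axis; each pixel stores the smallest depth seen."""
    image = [[None] * width for _ in range(height)]
    for x, y, z in points:
        if 0 <= x < width and 0 <= y < height:
            current = image[y][x]
            if current is None or z < current:
                image[y][x] = z  # keep the point nearest to the projection plane
    return image
```

An attribute image could be built the same way by storing, for the point kept at each pixel, its attribute value (color, for example) instead of its depth.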

Attribute image generator 4653 generates an attribute image based on the attribute information and the map information generated by additional information generator 4651. The attribute image is an image in which attribute information (color (RGB), for example) is indicated as a pixel value, for example. The image may be an image of a plurality of point clouds viewed from one point of view (an image of a plurality of point clouds projected onto one two-dimensional plane), a plurality of images of a plurality of point clouds viewed from a plurality of points of view, or a single image integrating the plurality of images.

Video encoder 4654 generates an encoded geometry image (compressed geometry image) and an encoded attribute image (compressed attribute image), which are encoded data, by encoding the geometry image and the attribute image in a video encoding scheme. Note that, as the video encoding scheme, any well-known encoding method can be used. For example, the video encoding scheme is AVC or HEVC.

Additional information encoder 4655 generates encoded additional information (compressed metadata) by encoding the additional information, the map information, and the like included in the point cloud data.

Multiplexer 4656 generates an encoded stream (compressed stream), which is encoded data, by multiplexing the encoded geometry image, the encoded attribute image, the encoded additional information, and other additional information. The generated encoded stream is output to a processor in a system layer (not shown).

Next, second decoder 4660, which is an example of decoder 4624 that performs decoding in the second encoding method, will be described. FIG. 11 is a diagram showing a configuration of second decoder 4660. FIG. 12 is a block diagram showing second decoder 4660. Second decoder 4660 generates point cloud data by decoding encoded data (encoded stream) that has been encoded in the second encoding method. Second decoder 4660 includes demultiplexer 4661, video decoder 4662, additional information decoder 4663, geometry information generator 4664, and attribute information generator 4665.

An encoded stream (compressed stream), which is encoded data, is input to second decoder 4660 from a processor in a system layer (not shown).

Demultiplexer 4661 separates an encoded geometry image (compressed geometry image), an encoded attribute image (compressed attribute image), encoded additional information (compressed metadata), and other additional information from the encoded data.

Video decoder 4662 generates a geometry image and an attribute image by decoding the encoded geometry image and the encoded attribute image in a video encoding scheme. Note that, as the video encoding scheme, any well-known encoding method can be used. For example, the video encoding scheme is AVC or HEVC.

Additional information decoder 4663 generates additional information including map information or the like by decoding the encoded additional information.

Geometry information generator 4664 generates geometry information from the geometry image and the map information. Attribute information generator 4665 generates attribute information from the attribute image and the map information.

Second decoder 4660 uses, in the decoding, the additional information required for decoding, and outputs additional information required for an application to the outside.

In the following, a problem with the PCC encoding scheme will be described. FIG. 13 is a diagram showing a protocol stack relating to PCC-encoded data. FIG. 13 shows an example in which PCC-encoded data is multiplexed with other medium data, such as a video (HEVC, for example) or an audio, and transmitted or accumulated.

A multiplexing scheme and a file format have a function of multiplexing various encoded data and transmitting or accumulating the data. To transmit or accumulate encoded data, the encoded data has to be converted into a format for the multiplexing scheme. For example, with HEVC, a technique for storing encoded data in a data structure referred to as a NAL unit and storing the NAL unit in ISOBMFF is prescribed.

At present, a first encoding method (Codec1) and a second encoding method (Codec2) are under investigation as encoding methods for point cloud data. However, neither the configuration of the encoded data nor a method of storing the encoded data in a system format has been defined. Thus, there is a problem that an encoder cannot perform an MUX process (multiplexing), transmission, or accumulation of data.

Note that, in the following, the term “encoding method” means either the first encoding method or the second encoding method unless a particular encoding method is specified.

Embodiment 2

In Embodiment 2, a method of storing the NAL unit in an ISOBMFF file will be described.

ISOBMFF is a file format standard prescribed in ISO/IEC 14496-12. ISOBMFF is a standard that does not depend on any medium, and prescribes a format that allows various media, such as a video, an audio, and a text, to be multiplexed and stored.

A basic structure (file) of ISOBMFF will be described. A basic unit of ISOBMFF is a box. A box is formed by type, length, and data, and a file is a set of various types of boxes.

FIG. 14 is a diagram showing a basic structure (file) of ISOBMFF. A file in ISOBMFF includes boxes, such as ftyp that indicates the brand of the file by four-character code (4CC), moov that stores metadata, such as control information (signaling information), and mdat that stores data.
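The box structure described above (length, type, and data) can be illustrated by the following minimal sketch, which walks the top-level boxes of a byte buffer. It assumes only the basic box header, that is, a 32-bit big-endian length that counts the 8-byte header followed by a four-character type; the 64-bit length form and the length value 0 are not handled. The function names and the toy payloads are hypothetical.

```python
import struct


def make_box(box_type, payload):
    """Build a box: 32-bit length (header included), 4CC type, then data."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("ascii")) + payload


def iter_boxes(buf):
    """Yield (type, data) for each top-level box in buf."""
    offset = 0
    while offset + 8 <= len(buf):
        size, box_type = struct.unpack_from(">I4s", buf, offset)
        if size < 8 or offset + size > len(buf):
            raise ValueError("unsupported or truncated box at offset %d" % offset)
        yield box_type.decode("ascii"), buf[offset + 8:offset + size]
        offset += size
```

A reader that does not understand a given box type can still skip it by using its length, which is what allows new media such as PCC-encoded data to be stored without changing the container structure.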

A method for storing each medium in the ISOBMFF file is separately prescribed. For example, a method of storing an AVC video or an HEVC video is prescribed in ISO/IEC 14496-15. Here, it can be contemplated to expand the functionality of ISOBMFF and use ISOBMFF to accumulate or transmit PCC-encoded data. However, there has been no convention for storing PCC-encoded data in an ISOBMFF file. In this embodiment, a method of storing PCC-encoded data in an ISOBMFF file will be described.

FIG. 15 is a diagram showing a protocol stack in a case where a common PCC codec NAL unit is stored in an ISOBMFF file. Here, a common PCC codec NAL unit is stored in an ISOBMFF file. Although the NAL unit is common to PCC codecs, a storage method for each codec (Carriage of Codec1, Carriage of Codec2) is desirably prescribed, since a plurality of PCC codecs are stored in the NAL unit.

Embodiment 3

In this embodiment, types of the encoded data (geometry information (geometry), attribute information (attribute), and additional information (metadata)) generated by first encoder 4630 or second encoder 4650 described above, a method of generating additional information (metadata), and a multiplexing process in the multiplexer will be described. The additional information (metadata) may be referred to as a parameter set or control information (signaling information).

In this embodiment, the dynamic object (three-dimensional point cloud data that varies with time) described above with reference to FIG. 4 will be described as an example. However, the same method can also be used for the static object (three-dimensional point cloud data associated with an arbitrary time point).

FIG. 16 is a diagram showing configurations of encoder 4801 and multiplexer 4802 in a three-dimensional data encoding device according to this embodiment. Encoder 4801 corresponds to first encoder 4630 or second encoder 4650 described above, for example. Multiplexer 4802 corresponds to multiplexer 4634 or 4656 described above.

Encoder 4801 encodes a plurality of PCC (point cloud compression) frames of point cloud data to generate a plurality of pieces of encoded data (multiple compressed data) of geometry information, attribute information, and additional information.

Multiplexer 4802 integrates a plurality of types of data (geometry information, attribute information, and additional information) into a NAL unit, thereby converting the data into a data configuration that takes data access in the decoding device into consideration.

FIG. 17 is a diagram showing a configuration example of the encoded data generated by encoder 4801. Arrows in the drawing indicate a dependence involved in decoding of the encoded data. The data at the source of an arrow depends on the data at the destination of the arrow. That is, the decoding device decodes the data of the destination of an arrow, and decodes the data of the source of the arrow using the decoded data. In other words, “a first entity depends on a second entity” means that data of the second entity is referred to (used) in processing (encoding, decoding, or the like) of data of the first entity.

First, a process of generating encoded data of geometry information will be described. Encoder 4801 encodes geometry information of each frame to generate encoded geometry data (compressed geometry data) for each frame. The encoded geometry data is denoted by G(i), where i denotes a frame number or a time point of a frame, for example.

Furthermore, encoder 4801 generates a geometry parameter set (GPS(i)) for each frame. The geometry parameter set includes a parameter that can be used for decoding of the encoded geometry data. The encoded geometry data for each frame depends on an associated geometry parameter set.

The encoded geometry data formed by a plurality of frames is defined as a geometry sequence. Encoder 4801 generates a geometry sequence parameter set (referred to also as geometry sequence PS or geometry SPS) that stores a parameter commonly used for a decoding process for the plurality of frames in the geometry sequence. The geometry sequence depends on the geometry SPS.

Next, a process of generating encoded data of attribute information will be described. Encoder 4801 encodes attribute information of each frame to generate encoded attribute data (compressed attribute data) for each frame. The encoded attribute data is denoted by A(i). FIG. 17 shows an example in which there are attribute X and attribute Y, and encoded attribute data for attribute X is denoted by AX(i), and encoded attribute data for attribute Y is denoted by AY(i).

Furthermore, encoder 4801 generates an attribute parameter set (APS(i)) for each frame. The attribute parameter set for attribute X is denoted by AXPS(i), and the attribute parameter set for attribute Y is denoted by AYPS(i). The attribute parameter set includes a parameter that can be used for decoding of the encoded attribute data. The encoded attribute data depends on an associated attribute parameter set.

The encoded attribute data formed by a plurality of frames is defined as an attribute sequence. Encoder 4801 generates an attribute sequence parameter set (referred to also as attribute sequence PS or attribute SPS) that stores a parameter commonly used for a decoding process for the plurality of frames in the attribute sequence. The attribute sequence depends on the attribute SPS.

In the first encoding method, the encoded attribute data depends on the encoded geometry data.

FIG. 17 shows an example in which there are two types of attribute information (attribute X and attribute Y). When there are two types of attribute information, for example, two encoders generate data and metadata for the two types of attribute information. For example, an attribute sequence is defined for each type of attribute information, and an attribute SPS is generated for each type of attribute information.

Note that, although FIG. 17 shows an example in which there is one type of geometry information, and there are two types of attribute information, the present invention is not limited thereto. There may be one type of attribute information or three or more types of attribute information. In such cases, encoded data can be generated in the same manner. The point cloud data may also include no attribute information. In such a case, encoder 4801 does not have to generate a parameter set associated with attribute information.

Next, a process of generating encoded data of additional information (metadata) will be described. Encoder 4801 generates a PCC stream parameter set (referred to also as PCC stream PS or stream PS), which is a parameter set for the entire PCC stream. Encoder 4801 stores a parameter that can be commonly used for a decoding process for one or more geometry sequences and one or more attribute sequences in the stream PS. For example, the stream PS includes identification information indicating the codec for the point cloud data and information indicating an algorithm used for the encoding. The geometry sequence and the attribute sequence depend on the stream PS.
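The dependence relationships described above can be modeled as a graph in which each entity lists the entities it depends on. The following minimal sketch, given only as an illustration, derives an order in which every entity is decoded after the entities it refers to. The entity labels are hypothetical, a single frame and a single attribute are assumed, the dependence of a sequence on its SPS and on the stream PS is modeled (as an assumption) as a dependence of each piece of encoded data in that sequence, and the use of the Python standard library `graphlib` module (Python 3.9 or later) is a choice made for this sketch.

```python
from graphlib import TopologicalSorter

# Maps each entity to the set of entities it depends on (first encoding method).
DEPENDS_ON = {
    "Stream PS": set(),
    "Geometry SPS": set(),
    "Attribute SPS": set(),
    "GPS(0)": set(),
    "APS(0)": set(),
    "G(0)": {"Stream PS", "Geometry SPS", "GPS(0)"},
    # The encoded attribute data also depends on the encoded geometry data.
    "A(0)": {"Stream PS", "Attribute SPS", "APS(0)", "G(0)"},
}


def decoding_order(depends_on):
    """Return the entities so that referred-to data always precedes referring data."""
    return list(TopologicalSorter(depends_on).static_order())
```

Any order returned by `decoding_order` places G(0) before A(0) and each parameter set before the data that refers to it; several valid orders generally exist.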

Next, an access unit and a GOF will be described. In this embodiment, concepts of access unit (AU) and group of frames (GOF) are newly introduced.

An access unit is a basic unit for accessing data in decoding, and is formed by one or more pieces of data and one or more pieces of metadata. For example, an access unit is formed by geometry information and one or more pieces of attribute information associated with a same time point. A GOF is a random access unit, and is formed by one or more access units.

Encoder 4801 generates an access unit header (AU header) as identification information indicating the top of an access unit. Encoder 4801 stores a parameter relating to the access unit in the access unit header. For example, the access unit header includes the configuration of, or information on, the encoded data included in the access unit. The access unit header further includes a parameter commonly used for the data included in the access unit, such as a parameter relating to decoding of the encoded data.

Note that encoder 4801 may generate an access unit delimiter that includes no parameter relating to the access unit, instead of the access unit header. The access unit delimiter is used as identification information indicating the top of the access unit. The decoding device identifies the top of the access unit by detecting the access unit header or the access unit delimiter.

Next, generation of identification information for the top of a GOF will be described. As identification information indicating the top of a GOF, encoder 4801 generates a GOF header. Encoder 4801 stores a parameter relating to the GOF in the GOF header. For example, the GOF header includes the configuration of, or information on, the encoded data included in the GOF. The GOF header further includes a parameter commonly used for the data included in the GOF, such as a parameter relating to decoding of the encoded data.

Note that encoder 4801 may generate a GOF delimiter that includes no parameter relating to the GOF, instead of the GOF header. The GOF delimiter is used as identification information indicating the top of the GOF. The decoding device identifies the top of the GOF by detecting the GOF header or the GOF delimiter.

In the PCC-encoded data, the access unit is defined as a PCC frame unit, for example. The decoding device accesses a PCC frame based on the identification information for the top of the access unit.

For example, the GOF is defined as one random access unit. The decoding device accesses a random access unit based on the identification information for the top of the GOF. For example, if PCC frames are independent from each other and can be separately decoded, a PCC frame can be defined as a random access unit.

Note that two or more PCC frames may be assigned to one access unit, and a plurality of random access units may be assigned to one GOF.

Encoder 4801 may define and generate a parameter set or metadata other than those described above. For example, encoder 4801 may generate supplemental enhancement information (SEI) that stores a parameter (an optional parameter) that is not always used for decoding.

Next, a configuration of encoded data and a method of storing encoded data in a NAL unit will be described.

For example, a data format is defined for each type of encoded data. FIG. 18 is a diagram showing an example of encoded data and a NAL unit.

For example, as shown in FIG. 18, encoded data includes a header and a payload. The encoded data may include length information indicating the length (data amount) of the encoded data, the header, or the payload. The encoded data may include no header.

The header includes identification information for identifying the data, for example. The identification information indicates a data type or a frame number, for example.

The header includes identification information indicating a reference relationship, for example. The identification information is stored in the header when there is a dependence relationship between data, for example, and allows an entity to refer to another entity. For example, the header of the entity to be referred to includes identification information for identifying the data. The header of the referring entity includes identification information indicating the entity to be referred to.

Note that, when the entity to be referred to or the referring entity can be identified or determined from other information, the identification information for identifying the data or identification information indicating the reference relationship can be omitted.

Multiplexer 4802 stores the encoded data in the payload of the NAL unit. The NAL unit header includes pcc_nal_unit_type, which is identification information for the encoded data. FIG. 19 is a diagram showing a semantics example of pcc_nal_unit_type.

As shown in FIG. 19, when pcc_codec_type is codec 1 (Codec1: first encoding method), values 0 to 10 of pcc_nal_unit_type are assigned to encoded geometry data (Geometry), encoded attribute X data (AttributeX), encoded attribute Y data (AttributeY), geometry PS (Geom. PS), attribute X PS (AttrX. PS), attribute Y PS (AttrY. PS), geometry SPS (Geometry Sequence PS), attribute X SPS (AttributeX Sequence PS), attribute Y SPS (AttributeY Sequence PS), AU header (AU Header), and GOF header (GOF Header) in codec 1. Values of 11 and greater are reserved in codec 1.

When pcc_codec_type is codec 2 (Codec2: second encoding method), values 0 to 2 of pcc_nal_unit_type are assigned to data A (DataA), metadata A (MetaDataA), and metadata B (MetaDataB) in codec 2. Values of 3 and greater are reserved in codec 2.
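The semantics described above can be written as a lookup from the codec type and the NAL unit type to the kind of encoded data. The following minimal sketch is given for illustration only: it assumes that the values are assigned in the order in which the data types are listed above, and the numeric codec type values 1 and 2 as well as the function name are hypothetical.

```python
# Index in each list = assumed value of pcc_nal_unit_type for that codec.
CODEC1_NAL_UNIT_TYPES = [
    "Geometry", "AttributeX", "AttributeY",
    "Geom. PS", "AttrX. PS", "AttrY. PS",
    "Geometry Sequence PS", "AttributeX Sequence PS", "AttributeY Sequence PS",
    "AU Header", "GOF Header",
]
CODEC2_NAL_UNIT_TYPES = ["DataA", "MetaDataA", "MetaDataB"]

# Hypothetical numeric values for pcc_codec_type.
NAL_UNIT_TYPES_BY_CODEC = {1: CODEC1_NAL_UNIT_TYPES, 2: CODEC2_NAL_UNIT_TYPES}


def describe_nal_unit(pcc_codec_type, pcc_nal_unit_type):
    """Return the data type name, or "reserved" for unassigned values."""
    names = NAL_UNIT_TYPES_BY_CODEC[pcc_codec_type]
    if 0 <= pcc_nal_unit_type < len(names):
        return names[pcc_nal_unit_type]
    return "reserved"
```

Because the same pcc_nal_unit_type value has a different meaning in each codec, a decoding device in this sketch has to know pcc_codec_type before it can interpret the NAL unit header.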

Next, an order of transmission of data will be described. In the following, restrictions on the order of transmission of NAL units will be described.

Multiplexer 4802 transmits NAL units on a GOF basis or on an AU basis. Multiplexer 4802 arranges the GOF header at the top of a GOF, and arranges the AU header at the top of an AU.

In order to allow the decoding device to decode the next AU and the following AUs even when data is lost because of a packet loss or the like, multiplexer 4802 may arrange a sequence parameter set (SPS) in each AU.
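The arrangement described above can be sketched as follows. In this illustration, each GOF is a list of access units, each access unit is a list of NAL unit labels, and the optional repetition of the SPS in each AU is controlled by a flag. The function name, the label strings, and the position of the SPS directly after the AU header are assumptions made for this sketch.

```python
def arrange_nal_units(gofs, sps_in_each_au=False):
    """Flatten GOFs into a transmission order with headers at the top of each unit."""
    stream = []
    for gof in gofs:
        stream.append("GOF Header")   # top of the GOF
        for access_unit in gof:
            stream.append("AU Header")  # top of the AU
            if sps_in_each_au:
                stream.append("SPS")    # repeated so a later AU is decodable after a loss
            stream.extend(access_unit)
    return stream
```

Repeating the SPS increases the amount of transmitted data, so in practice the flag would be a trade-off between overhead and robustness against data loss.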

When there is a dependence relationship for decoding between encoded data, the decoding device decodes the data of the entity to be referred to and then decodes the data of the referring entity. In order to allow the decoding device to perform decoding in the order of reception without rearranging the data, multiplexer 4802 first transmits the data of the entity to be referred to.

FIG. 20 is a diagram showing examples of the order of transmission of NAL units. FIG. 20 shows three examples, that is, geometry information-first order, parameter set-first order, and data-integrated order.

The geometry information-first order of transmission is an example in which information relating to geometry information is transmitted together, and information relating to attribute information is transmitted together. In the case of this order of transmission, the transmission of the information relating to the geometry information ends earlier than the transmission of the information relating to the attribute information.

For example, when this order of transmission is used and the decoding device does not decode attribute information, the decoding device may be able to have an idle time since the decoding device can omit decoding of attribute information. When the decoding device is required to decode geometry information early, the decoding device may be able to decode geometry information earlier since the decoding device obtains encoded data of the geometry information earlier.

Note that, although in FIG. 20 the attribute X SPS and the attribute Y SPS are integrated and shown as the attribute SPS, the attribute X SPS and the attribute Y SPS may be separately arranged.

In the parameter set-first order of transmission, a parameter set is first transmitted, and data is then transmitted.

As described above, as long as the restrictions on the order of transmission of NAL units are met, multiplexer 4802 can transmit NAL units in any order. For example, order identification information may be defined, and multiplexer 4802 may have a function of transmitting NAL units in a plurality of orders. For example, the order identification information for NAL units is stored in the stream PS.

The three-dimensional data decoding device may perform decoding based on the order identification information. The three-dimensional data decoding device may indicate a desired order of transmission to the three-dimensional data encoding device, and the three-dimensional data encoding device (multiplexer 4802) may control the order of transmission according to the indicated order of transmission.

Note that multiplexer 4802 can generate encoded data having a plurality of functions merged with each other as in the case of the data-integrated order of transmission, as long as the restrictions on the order of transmission are met. For example, as shown in FIG. 20, the GOF header and the AU header may be integrated, or AXPS and AYPS may be integrated. In such a case, an identifier that indicates data having a plurality of functions is defined in pcc_nal_unit_type.

In the following, variations of this embodiment will be described. There are levels of PSs, such as a frame-level PS, a sequence-level PS, and a PCC sequence-level PS. Provided that the PCC sequence level is a higher level, and the frame level is a lower level, parameters can be stored in the manner described below.

The value of a default PS is indicated in a PS at a higher level. If the value of a PS at a lower level differs from the value of the PS at a higher level, the value of the PS is indicated in the PS at the lower level. Alternatively, the value of the PS is not described in the PS at the higher level but is described in the PS at the lower level. Alternatively, information indicating whether the value of the PS is indicated in the PS at the lower level, at the higher level, or at both the levels is indicated in both or one of the PS at the lower level and the PS at the higher level. Alternatively, the PS at the lower level may be merged with the PS at the higher level. If the PS at the lower level and the PS at the higher level overlap with each other, multiplexer 4802 may omit transmission of one of the PSs.
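The first variation above (a default value at the higher level, overridden at the lower level only when the value differs) can be sketched as follows. This is an illustration under assumptions: parameter sets are modeled as dictionaries, and the parameter names `qp` and `depth` and the function names are hypothetical.

```python
def lower_level_ps(higher_ps, wanted_values):
    """Keep in the lower-level PS only the values that differ from the higher level."""
    return {name: value for name, value in wanted_values.items()
            if name not in higher_ps or higher_ps[name] != value}


def resolve_parameters(*ps_levels):
    """Merge PSs given from the highest to the lowest level; lower levels override."""
    resolved = {}
    for ps in ps_levels:
        resolved.update(ps)
    return resolved
```

For example, with a higher-level PS of `{"qp": 32, "depth": 10}` and a frame that needs `qp` 28 with the same `depth`, the frame-level PS in this sketch carries only `qp`, and the decoding side recovers both values by merging the two levels.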

Note that encoder 4801 or multiplexer 4802 may divide data into slices or tiles and transmit each of the divided slices or tiles as divided data. The divided data includes information for identifying the divided data, and a parameter used for decoding of the divided data is included in the parameter set. In this case, an identifier that indicates that the data is data relating to a tile or slice or data storing a parameter is defined in pcc_nal_unit_type.

Embodiment 4

For HEVC encoding, in order to enable parallel processing in a decoding device, there are slice-based or tile-based data division tools, for example. However, there is no such tool for point cloud compression (PCC) encoding.

In PCC, various data division methods are possible, depending on the parallel processing, the compression efficiency, and the compression algorithm. Here, definitions of a slice and a tile, a data structure, and transmission and reception methods will be described.

FIG. 21 is a block diagram showing a configuration of first encoder 4910 included in a three-dimensional data encoding device according to this embodiment. First encoder 4910 generates encoded data (encoded stream) by encoding point cloud data in a first encoding method (geometry-based PCC (GPCC)). First encoder 4910 includes divider 4911, a plurality of geometry information encoders 4912, a plurality of attribute information encoders 4913, additional information encoder 4914, and multiplexer 4915.

Divider 4911 generates a plurality of pieces of divided data by dividing point cloud data. Specifically, divider 4911 generates a plurality of pieces of divided data by dividing a space of point cloud data into a plurality of subspaces. Here, a subspace is a combination of tiles or slices, or a combination of tiles and slices. More specifically, point cloud data includes geometry information, attribute information, and additional information. Divider 4911 divides geometry information into a plurality of pieces of divided geometry information, and divides attribute information into a plurality of pieces of divided attribute information. Divider 4911 also generates additional information concerning the division.

The plurality of geometry information encoders 4912 generate a plurality of pieces of encoded geometry information by encoding a plurality of pieces of divided geometry information. For example, the plurality of geometry information encoders 4912 process a plurality of pieces of divided geometry information in parallel.

The plurality of attribute information encoders 4913 generate a plurality of pieces of encoded attribute information by encoding a plurality of pieces of divided attribute information. For example, the plurality of attribute information encoders 4913 process a plurality of pieces of divided attribute information in parallel.

Additional information encoder 4914 generates encoded additional information by encoding additional information included in the point cloud data and additional information concerning the data division generated in the division by divider 4911.

Multiplexer 4915 generates encoded data (encoded stream) by multiplexing a plurality of pieces of encoded geometry information, a plurality of pieces of encoded attribute information, and encoded additional information, and transmits the generated encoded data. The encoded additional information is used for decoding.

Note that, although FIG. 21 shows an example in which there are two geometry information encoders 4912 and two attribute information encoders 4913, the number of geometry information encoders 4912 and the number of attribute information encoders 4913 may be one, or three or more. The plurality of pieces of divided data may be processed in parallel in the same chip, such as by a plurality of cores of a CPU, processed in parallel by cores of a plurality of chips, or processed in parallel by a plurality of cores of a plurality of chips.

FIG. 22 is a block diagram showing a configuration of first decoder 4920. First decoder 4920 reproduces point cloud data by decoding encoded data (encoded stream) generated by encoding the point cloud data in the first encoding method (GPCC). First decoder 4920 includes demultiplexer 4921, a plurality of geometry information decoders 4922, a plurality of attribute information decoders 4923, additional information decoder 4924, and combiner 4925.

Demultiplexer 4921 generates a plurality of pieces of encoded geometry information, a plurality of pieces of encoded attribute information, and encoded additional information by demultiplexing encoded data (encoded stream).

The plurality of geometry information decoders 4922 generate a plurality of pieces of divided geometry information by decoding a plurality of pieces of encoded geometry information. For example, the plurality of geometry information decoders 4922 process a plurality of pieces of encoded geometry information in parallel.

The plurality of attribute information decoders 4923 generate a plurality of pieces of divided attribute information by decoding a plurality of pieces of encoded attribute information. For example, the plurality of attribute information decoders 4923 process a plurality of pieces of encoded attribute information in parallel.

Additional information decoder 4924 generates additional information by decoding encoded additional information.

Combiner 4925 generates geometry information by combining a plurality of pieces of divided geometry information using additional information. Combiner 4925 generates attribute information by combining a plurality of pieces of divided attribute information using additional information.

Note that, although FIG. 22 shows an example in which there are two geometry information decoders 4922 and two attribute information decoders 4923, the number of geometry information decoders 4922 and the number of attribute information decoders 4923 may be one, or three or more. The plurality of pieces of divided data may be processed in parallel in the same chip, such as by a plurality of cores of a CPU, processed in parallel by cores of a plurality of chips, or processed in parallel by a plurality of cores of a plurality of chips.

Next, a configuration of divider 4911 will be described. FIG. 23 is a block diagram showing divider 4911. Divider 4911 includes slice divider 4931, geometry information tile divider (geometry tile divider) 4932, and attribute information tile divider (attribute tile divider) 4933.

Slice divider 4931 generates a plurality of pieces of slice geometry information by dividing geometry information (position (geometry)) into slices. Slice divider 4931 also generates a plurality of pieces of slice attribute information by dividing attribute information (attribute) into slices. Slice divider 4931 also outputs slice additional information (slice metadata) including information concerning the slice division and information generated in the slice division.

Geometry information tile divider 4932 generates a plurality of pieces of divided geometry information (a plurality of pieces of tile geometry information) by dividing a plurality of pieces of slice geometry information into tiles. Geometry information tile divider 4932 also outputs geometry tile additional information (geometry tile metadata) including information concerning the tile division of geometry information and information generated in the tile division of geometry information.

Attribute information tile divider 4933 generates a plurality of pieces of divided attribute information (a plurality of pieces of tile attribute information) by dividing a plurality of pieces of slice attribute information into tiles. Attribute information tile divider 4933 also outputs attribute tile additional information (attribute tile metadata) including information concerning the tile division of attribute information and information generated in the tile division of attribute information.

Note that the number of slices or tiles generated by division is equal to or greater than 1. That is, the slice division or tile division need not be performed.

Although an example in which tile division is performed after slice division has been shown here, slice division may be performed after tile division. Alternatively, other units of division may be defined in addition to slice and tile, and the division may be performed based on three or more units of division.

Hereinafter, the dividing method for point cloud data will be described. FIG. 24 is a diagram illustrating an example of slice and tile dividing.

First, the method for slice dividing will be described. Divider 4911 divides three-dimensional point cloud data into arbitrary point clouds on a slice-by-slice basis. In slice dividing, divider 4911 does not divide the geometry information and the attribute information constituting points separately, but collectively divides the geometry information and the attribute information. That is, divider 4911 performs slice dividing so that the geometry information and the attribute information of an arbitrary point belong to the same slice. Note that, as long as this condition is satisfied, the number of divisions and the dividing method may be any number and any method. Furthermore, the minimum unit of division is a point. For example, the numbers of divisions of geometry information and attribute information are the same. For example, a three-dimensional point corresponding to geometry information after slice dividing, and a three-dimensional point corresponding to attribute information are included in the same slice.
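As a non-limiting illustration, slice dividing in which the geometry information and the attribute information of each point are kept in the same slice can be sketched as follows. The sketch assumes that the point cloud is given as two parallel lists and that a caller-supplied function, here called slice_of, decides the slice of each point from its position; these names are hypothetical and the rule used to assign slices is left open, as in the description above.

```python
from collections import defaultdict


def divide_into_slices(positions, attributes, slice_of):
    """Split parallel position/attribute lists into slices, point by point.

    slice_of(position) returns the slice index for a point. Because one index
    is computed per point and applied to both lists, the geometry and the
    attribute of any point always land in the same slice.
    """
    if len(positions) != len(attributes):
        raise ValueError("each point needs exactly one position and one attribute")
    slice_geometry = defaultdict(list)
    slice_attribute = defaultdict(list)
    for position, attribute in zip(positions, attributes):
        index = slice_of(position)
        slice_geometry[index].append(position)
        slice_attribute[index].append(attribute)
    return dict(slice_geometry), dict(slice_attribute)
```

Since the same index drives both outputs, the number of geometry slices and the number of attribute slices are necessarily equal.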

Also, divider 4911 generates slice additional information, which is additional information related to the number of divisions and the dividing method at the time of slice dividing. The slice additional information is the same for geometry information and attribute information. For example, the slice additional information includes the information indicating the reference coordinate position, size, or side length of a bounding box after division. Also, the slice additional information includes the information indicating the number of divisions, the division type, etc.

Next, the method for tile dividing will be described. Divider 4911 divides the data divided into slices into slice geometry information (G slice) and slice attribute information (A slice), and divides each of the slice geometry information and the slice attribute information on a tile-by-tile basis.

Note that, although FIG. 24 illustrates the example in which division is performed with an octree structure, the number of divisions and the dividing method may be any number and any method.

Also, divider 4911 may divide geometry information and attribute information with different dividing methods, or may divide geometry information and attribute information with the same dividing method. Additionally, divider 4911 may divide a plurality of slices into tiles with different dividing methods, or may divide a plurality of slices into tiles with the same dividing method.

Furthermore, divider 4911 generates tile additional information related to the number of divisions and the dividing method at the time of tile dividing. The tile additional information (geometry tile additional information and attribute tile additional information) is separate for geometry information and attribute information. For example, the tile additional information includes the information indicating the reference coordinate position, size, or side length of a bounding box after division. Additionally, the tile additional information includes the information indicating the number of divisions, the division type, etc.

Next, an example of the method of dividing point cloud data into slices or tiles will be described. As the method for slice or tile dividing, divider 4911 may use a predetermined method, or may adaptively switch methods to be used according to point cloud data.

At the time of slice dividing, divider 4911 divides a three-dimensional space by collectively handling geometry information and attribute information. For example, divider 4911 determines the shape of an object, and divides a three-dimensional space into slices according to the shape of the object. For example, divider 4911 extracts objects such as trees or buildings, and performs division on an object-by-object basis. For example, divider 4911 performs slice dividing so that the entirety of one or a plurality of objects is included in one slice. Alternatively, divider 4911 divides one object into a plurality of slices.

In this case, the encoding device may change the encoding method for each slice, for example. For example, the encoding device may use a high-quality compression method for a specific object or a specific part of the object. In this case, the encoding device may store the information indicating the encoding method for each slice in additional information (metadata).

Also, divider 4911 may perform slice dividing so that each slice corresponds to a predetermined coordinate space based on map information or geometry information.

At the time of tile dividing, divider 4911 separately divides geometry information and attribute information. For example, divider 4911 divides slices into tiles according to the data amount or the processing amount. For example, divider 4911 determines whether the data amount of a slice (for example, the number of three-dimensional points included in the slice) is greater than a predetermined threshold value. When the data amount of the slice is greater than the threshold value, divider 4911 divides the slice into tiles. When the data amount of the slice is equal to or less than the threshold value, divider 4911 does not divide the slice into tiles.
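As a non-limiting illustration, a threshold-based tile division can be sketched as follows. The sketch assumes that the data amount is the number of three-dimensional points, that a slice whose point count does not exceed the threshold is left undivided, and that a slice over the threshold is repeatedly halved along one coordinate axis. The halving rule and the function name are assumptions made for the example; as noted above, any number of divisions and any dividing method may be used.

```python
def divide_into_tiles(points, threshold, axis=0):
    """Return a list of tiles, each holding at most `threshold` points.

    A slice at or below the threshold is returned as a single tile, which
    corresponds to tile division not being performed.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    points = list(points)
    if len(points) <= threshold:
        return [points]
    # Over the threshold: sort along the chosen axis and split at the median.
    ordered = sorted(points, key=lambda p: p[axis])
    middle = len(ordered) // 2
    return (divide_into_tiles(ordered[:middle], threshold, axis)
            + divide_into_tiles(ordered[middle:], threshold, axis))
```

Bounding the number of points per tile in this way is one simple means of keeping the per-tile processing amount in the decoding device within a certain range.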

For example, divider 4911 divides slices into tiles so that the processing amount or processing time in the decoding device is within a certain range (equal to or less than a predetermined value). Accordingly, the processing amount per tile in the decoding device becomes constant, and distributed processing in the decoding device becomes easy.

Additionally, when the processing amount is different between geometry information and attribute information, for example, when the processing amount of geometry information is greater than the processing amount of attribute information, divider 4911 makes the number of divisions of geometry information larger than the number of divisions of attribute information.
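As a non-limiting illustration, one way of choosing the number of divisions from the processing amount is sketched below. The sketch assumes that a processing amount can be estimated as a single number per slice and that a per-tile budget (the predetermined value) is given; both quantities and the function name are hypothetical.

```python
import math


def division_count(processing_amount, per_tile_budget):
    """Smallest number of equal parts that keeps each part within the budget."""
    if per_tile_budget <= 0:
        raise ValueError("per_tile_budget must be positive")
    return max(1, math.ceil(processing_amount / per_tile_budget))
```

Under this rule, a geometry slice with an estimated processing amount of 100 and an attribute slice with an estimated processing amount of 40, both with a budget of 30 per tile, would be divided into four tiles and two tiles, respectively, so that the geometry information receives the larger number of divisions.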

Furthermore, for example, when geometry information may be decoded and displayed earlier, and attribute information may be slowly decoded and displayed later in the decoding device according to contents, divider 4911 may make the number of divisions of geometry information larger than the number of divisions of attribute information. Accordingly, since the decoding device can increase the degree of parallelism for geometry information, it is possible to make the processing of geometry information faster than the processing of attribute information.

Note that the decoding device does not necessarily have to process sliced or tiled data in parallel, and may determine whether or not to process them in parallel according to the number or capability of decoding processors.

By performing division with the method as described above, it is possible to achieve adaptive encoding according to contents or objects. Also, parallel processing in decoding processing can be achieved. Accordingly, the flexibility of a point cloud encoding system or a point cloud decoding system is improved.

FIG. 25 is a diagram illustrating dividing pattern examples of slices and tiles. DU in the diagram is a data unit (DataUnit), and indicates the data of a tile or a slice. Additionally, each DU includes a slice index (SliceIndex) and a tile index (TileIndex). The top right numerical value of a DU in the diagram indicates the slice index, and the bottom left numerical value of the DU indicates the tile index.

In Pattern 1, in slice dividing, the number of divisions and the dividing method are the same for G slice and A slice. In tile dividing, the number of divisions and the dividing method for G slice are different from the number of divisions and the dividing method for A slice. Additionally, the same number of divisions and dividing method are used among a plurality of G slices. The same number of divisions and dividing method are used among a plurality of A slices.

In Pattern 2, in slice dividing, the number of divisions and the dividing method are the same for G slice and A slice. In tile dividing, the number of divisions and the dividing method for G slice are different from the number of divisions and the dividing method for A slice. Additionally, the number of divisions and the dividing method are different among a plurality of G slices. The number of divisions and the dividing method are different among a plurality of A slices.

Next, a method of encoding divided data will be described. The three-dimensional data encoding device (first encoder 4910) encodes each piece of divided data. When encoding attribute information, the three-dimensional data encoding device generates, as additional information, dependency information that indicates on which configuration information (geometry information, additional information, or other attribute information) the encoding is based. That is, the dependency information indicates configuration information on a reference destination (dependency destination). In this case, the three-dimensional data encoding device generates dependency information based on configuration information corresponding to a pattern of division of attribute information. Note that the three-dimensional data encoding device may generate dependency information based on configuration information for a plurality of patterns of division of attribute information.

The dependency information may be generated by the three-dimensional data encoding device, and the generated dependency information may be transmitted to a three-dimensional data decoding device. Alternatively, the three-dimensional data decoding device may generate dependency information, and the three-dimensional data encoding device may transmit no dependency information. Alternatively, a dependency used by the three-dimensional data encoding device may be previously determined, and the three-dimensional data encoding device may transmit no dependency information.

FIG. 26 is a diagram showing an example of the dependency between data. In the drawing, the destination of an arrow indicates a dependency destination, and the source of an arrow indicates a dependency source. The three-dimensional data decoding device first decodes data concerning a dependency destination and then decodes data concerning a dependency source. Data indicated by a solid line in the drawing is data that is actually transmitted, and data indicated by a dotted line is data that is not transmitted.

In the drawing, G denotes geometry information, and A denotes attribute information. G_(s1) denotes geometry information concerning slice number 1, and G_(s2) denotes geometry information concerning slice number 2. G_(s1t1) denotes geometry information concerning slice number 1 and tile number 1, G_(s1t2) denotes geometry information concerning slice number 1 and tile number 2, G_(s2t1) denotes geometry information concerning slice number 2 and tile number 1, and G_(s2t2) denotes geometry information concerning slice number 2 and tile number 2. Similarly, A_(s1) denotes attribute information concerning slice number 1, and A_(s2) denotes attribute information concerning slice number 2. A_(s1t1) denotes attribute information concerning slice number 1 and tile number 1, A_(s1t2) denotes attribute information concerning slice number 1 and tile number 2, A_(s2t1) denotes attribute information concerning slice number 2 and tile number 1, and A_(s2t2) denotes attribute information concerning slice number 2 and tile number 2.

M_(slice) denotes slice additional information, MG_(tile) denotes geometry tile additional information, and MA_(tile) denotes attribute tile additional information. D_(s1t1) denotes dependency information for attribute information A_(s1t1), and D_(s2t1) denotes dependency information for attribute information A_(s2t1).

The three-dimensional data encoding device may rearrange data in the order of decoding so that the three-dimensional data decoding device does not need to rearrange data. Note that the three-dimensional data decoding device may rearrange data, or both the three-dimensional data encoding device and the three-dimensional data decoding device may rearrange data.

FIG. 27 is a diagram showing an example of the order of decoding of data. In the example in FIG. 27, data is decoded in order from left to right. When there is a dependency between data to be decoded, the three-dimensional data decoding device first decodes data on the dependency destination. For example, the three-dimensional data encoding device transmits the data after rearranging the data in that order. Note that the order can be any order as long as the data concerning the dependency destination is decoded first. The three-dimensional data encoding device may transmit additional information and dependency information before transmitting data.

FIG. 28 is a flowchart showing a flow of a process performed by the three-dimensional data encoding device. First, the three-dimensional data encoding device encodes a plurality of slices or tiles of data as described above (S4901). The three-dimensional data encoding device then rearranges the data so that the data concerning the dependency destination comes first as shown in FIG. 27 (S4902). The three-dimensional data encoding device then multiplexes the rearranged data (into a NAL unit) (S4903).
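As a non-limiting illustration, the rearrangement in step S4902 can be sketched as a depth-first ordering in which every dependency destination is emitted before its dependency source. The sketch assumes that each piece of data is identified by a name and that the dependency information is available as a mapping from a dependency source to its dependency destinations; the function name and this representation are assumptions made for the example.

```python
def order_for_decoding(units, depends_on):
    """Return `units` reordered so that dependency destinations come first.

    units      : names of the data pieces in their original order
    depends_on : mapping from a name to the names it depends on
    Any destination named in depends_on but absent from units is also emitted.
    """
    ordered = []
    state = {}  # name -> "visiting" while on the stack, "done" once emitted

    def visit(name):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError("circular dependency at " + name)
        state[name] = "visiting"
        for destination in depends_on.get(name, ()):
            visit(destination)  # emit the dependency destination first
        state[name] = "done"
        ordered.append(name)

    for name in units:
        visit(name)
    return ordered
```

For example, if A_s1t1 depends on G_s1t1 and G_s1t1 depends on M_slice, the resulting order is M_slice, G_s1t1, A_s1t1, regardless of the order in which the three pieces were produced.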

Next, a configuration of combiner 4925 included in first decoder 4920 will be described. FIG. 29 is a block diagram showing a configuration of combiner 4925. Combiner 4925 includes geometry information tile combiner (geometry tile combiner) 4941, attribute information tile combiner (attribute tile combiner) 4942, and slice combiner 4943.

Geometry information tile combiner 4941 generates a plurality of pieces of slice geometry information by combining a plurality of pieces of divided geometry information using geometry tile additional information. Attribute information tile combiner 4942 generates a plurality of pieces of slice attribute information by combining a plurality of pieces of divided attribute information using attribute tile additional information.

Slice combiner 4943 generates geometry information by combining a plurality of pieces of slice geometry information using slice additional information. Slice combiner 4943 also generates attribute information by combining a plurality of pieces of slice attribute information using slice additional information.

Note that the number of slices or tiles generated by division is equal to or greater than 1. That is, the slice division or tile division need not be performed.

Furthermore, although an example in which tile division is performed after slice division has been shown here, slice division may be performed after tile division. Alternatively, other units of division may be defined in addition to slice and tile, and the division may be performed based on three or more units of division.

Next, a configuration of encoded data divided into slices or tiles, and a method of storing (multiplexing) encoded data into a NAL unit will be described. FIG. 30 is a diagram showing a configuration of encoded data and a method of storing encoded data into a NAL unit.

Encoded data (divided geometry information and divided attribute information) is stored in a payload of a NAL unit.

Encoded data includes a header and a payload. The header includes identification information for identifying data included in the payload. The identification information includes a type (slice_type, tile_type) of slice division or tile division, index information (slice_idx, tile_idx) for identifying a slice or tile, geometry information on data (slice or tile), or an address (address) of data, for example. The index information for identifying a slice is referred to also as a slice index (SliceIndex). The index information for identifying a tile is referred to also as a tile index (TileIndex). The type of division may be a scheme based on an object shape, a scheme based on map information or geometry information, or a scheme based on a data amount or processing amount, for example.
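As a non-limiting illustration, a header carrying slice_type, tile_type, slice_idx, and tile_idx in front of a payload can be sketched as follows. The field names are taken from the description above, but the field widths, the byte order, and the class and method names are assumptions made for the example and are not the bitstream syntax defined in this disclosure.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: two 1-byte types followed by two 2-byte indices, big-endian.
_HEADER_FORMAT = ">BBHH"
_HEADER_SIZE = struct.calcsize(_HEADER_FORMAT)


@dataclass(frozen=True)
class DividedDataHeader:
    slice_type: int
    tile_type: int
    slice_idx: int
    tile_idx: int

    def pack(self) -> bytes:
        """Serialize the header fields into the assumed fixed-size layout."""
        return struct.pack(_HEADER_FORMAT, self.slice_type, self.tile_type,
                           self.slice_idx, self.tile_idx)

    @classmethod
    def unpack(cls, data: bytes):
        """Split encoded data into (header, payload)."""
        fields = struct.unpack(_HEADER_FORMAT, data[:_HEADER_SIZE])
        return cls(*fields), data[_HEADER_SIZE:]
```

A receiver would call DividedDataHeader.unpack on the encoded data taken from the NAL unit payload and use the returned slice_idx and tile_idx to decide where the remaining bytes belong.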

All or part of the information described above may be stored in one of the header of the divided geometry information and the header of the divided attribute information and not be stored in the other. For example, when the same division method is used for the geometry information and the attribute information, the same type of division (slice_type, tile_type) and the same index information (slice_idx, tile_idx) are used for the geometry information and the attribute information. Therefore, these pieces of information may be included in the header of one of the geometry information and the attribute information. For example, when the attribute information depends on the geometry information, the geometry information is processed first. Therefore, the header of the geometry information may include these pieces of information, and the header of the attribute information may not include these pieces of information. In this case, the three-dimensional data decoding device determines that the attribute information concerning the dependency source belongs to the same slice or tile as the slice or tile of the geometry information concerning the dependency destination, for example.

The additional information (slice additional information, geometry tile additional information, or attribute tile additional information) concerning the slice division or tile division, dependency information indicating a dependency, and the like may be stored in an existing parameter set (GPS, APS, geometry SPS, attribute SPS, or the like) and transmitted. When the division method varies from frame to frame, information indicating a division method may be stored in a parameter set (GPS, APS, or the like) for each frame. When the division method does not vary in a sequence, information indicating a division method may be stored in a parameter set (geometry SPS or attribute SPS) for each sequence. Furthermore, when the same division method is used for the geometry information and the attribute information, information indicating the division method may be stored in a parameter set (stream PS) for the PCC stream.

The information described above may be stored in any of the parameter sets described above, or may be stored in a plurality of parameter sets. Alternatively, a parameter set for tile division or slice division may be defined, and the information described above may be stored in the parameter set. Alternatively, these pieces of information may be stored in the header of encoded data.

The header of encoded data includes identification information indicating a dependency. That is, when there is a dependency between data, the header includes identification information that allows the dependency source to refer to the dependency destination. For example, the header of the data of the dependency destination includes identification information for identifying the data. The header of the data of the dependency source includes identification information indicating the dependency destination. Note that the identification information for identifying data, the additional information concerning slice division or tile division, and the identification information indicating a dependency may be omitted if these pieces of information can be identified or derived from other information.

Next, a flow of a process of encoding point cloud data and a flow of a process of decoding point cloud data according to this embodiment will be described. FIG. 31 is a flowchart of a process of encoding point cloud data according to this embodiment.

First, the three-dimensional data encoding device determines a division method to be used (S4911). The division method includes a determination of whether to perform slice division or not and a determination of whether to perform tile division or not. The division method may include the number of slices or tiles in the case where slice division or tile division is performed, and the type of division, for example. The type of division is a scheme based on an object shape, a scheme based on map information or geometry information, or a scheme based on a data amount or processing amount, for example. The division method may be determined in advance.

When slice division is to be performed (if Yes in S4912), the three-dimensional data encoding device generates a plurality of pieces of slice geometry information and a plurality of pieces of slice attribute information by collectively dividing the geometry information and the attribute information (S4913). The three-dimensional data encoding device also generates slice additional information concerning the slice division. Note that the three-dimensional data encoding device may independently divide the geometry information and the attribute information.

When tile division is to be performed (if Yes in S4914), the three-dimensional data encoding device generates a plurality of pieces of divided geometry information and a plurality of pieces of divided attribute information by independently dividing the plurality of pieces of slice geometry information and the plurality of pieces of slice attribute information (or the geometry information and the attribute information) (S4915). The three-dimensional data encoding device also generates geometry tile additional information and attribute tile additional information concerning the tile division. The three-dimensional data encoding device may collectively divide the slice geometry information and the slice attribute information.

The three-dimensional data encoding device then generates a plurality of pieces of encoded geometry information and a plurality of pieces of encoded attribute information by encoding each of the plurality of pieces of divided geometry information and the plurality of pieces of divided attribute information (S4916). The three-dimensional data encoding device also generates dependency information.

The three-dimensional data encoding device then generates encoded data (encoded stream) by integrating (multiplexing) the plurality of pieces of encoded geometry information, the plurality of pieces of encoded attribute information, and the additional information into a NAL unit (S4917). The three-dimensional data encoding device also transmits the generated encoded data.

FIG. 32 is a flowchart of a process of decoding point cloud data according to this embodiment. First, the three-dimensional data decoding device determines the division method by analyzing additional information (slice additional information, geometry tile additional information, and attribute tile additional information) concerning the division method included in encoded data (encoded stream) (S4921). The division method includes an indication of whether slice division has been performed and an indication of whether tile division has been performed. The division method may include the number of slices or tiles in the case where slice division or tile division is performed, and the type of division, for example.

The three-dimensional data decoding device then generates divided geometry information and divided attribute information by decoding a plurality of pieces of encoded geometry information and a plurality of pieces of encoded attribute information included in the encoded data using dependency information included in the encoded data (S4922).

If the additional information indicates that tile division has been performed (if Yes in S4923), the three-dimensional data decoding device generates a plurality of pieces of slice geometry information and a plurality of pieces of slice attribute information by combining the plurality of pieces of divided geometry information and the plurality of pieces of divided attribute information in respective manners based on the geometry tile additional information and the attribute tile additional information (S4924). Note that the three-dimensional data decoding device may combine the plurality of pieces of divided geometry information and the plurality of pieces of divided attribute information in the same manner.

If the additional information indicates that slice division has been performed (if Yes in S4925), the three-dimensional data decoding device generates geometry information and attribute information by combining the plurality of pieces of slice geometry information and the plurality of pieces of slice attribute information (the plurality of pieces of divided geometry information and the plurality of pieces of divided attribute information) in the same manner based on the slice additional information (S4926). Note that the three-dimensional data decoding device may combine the plurality of pieces of slice geometry information and the plurality of pieces of slice attribute information in different manners.
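As a non-limiting illustration, the combining in steps S4924 and S4926 can be sketched as follows. The sketch assumes that each decoded piece of divided data arrives as a tuple of a slice index, a tile index, and a list of points, and that combining amounts to concatenating the points in index order; the function name and this representation are assumptions made for the example.

```python
def combine_divided_data(units):
    """Rebuild one point list from (slice_idx, tile_idx, points) tuples.

    Tiles are first grouped under their slice (tile combining), and the
    slices are then concatenated in index order (slice combining).
    """
    slices = {}
    for slice_idx, tile_idx, points in units:
        slices.setdefault(slice_idx, {})[tile_idx] = points
    combined = []
    for slice_idx in sorted(slices):
        for tile_idx in sorted(slices[slice_idx]):
            combined.extend(slices[slice_idx][tile_idx])
    return combined
```

Because the indices alone determine the result, the pieces may be decoded in parallel and may arrive in any order.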

As described above, the three-dimensional data encoding device according to this embodiment performs the process shown in FIG. 33. First, the three-dimensional data encoding device divides data into a plurality of pieces of divided data (tiles, for example) that are included in a plurality of subspaces (slices, for example) generated by dividing a target space including a plurality of three-dimensional points and each of which includes one or more three-dimensional points. Here, the divided data is a collection of one or more pieces of data including one or more three-dimensional points that is included in a subspace. The divided data can also be regarded as a space and may include a space including no three-dimensional point. One subspace may include a plurality of pieces of divided data, or one subspace may include one piece of divided data. Note that a plurality of subspaces or one subspace may be set in a target space.

The three-dimensional data encoding device then generates a plurality of pieces of encoded data each associated with a different one of the plurality of pieces of divided data by encoding each of the plurality of pieces of divided data (S4931). The three-dimensional data encoding device generates a bitstream including the plurality of pieces of encoded data and a plurality of pieces of control information (the header shown in FIG. 30, for example) each associated with a different one of the plurality of pieces of encoded data (S4932). In each of the plurality of pieces of control information, a first identifier (slice_idx, for example) that indicates a subspace associated with the piece of encoded data associated with the piece of control information and a second identifier (tile_idx, for example) that indicates a piece of divided data associated with the piece of encoded data associated with the piece of control information are stored.

With such a configuration, the three-dimensional data decoding device that decodes the bitstream generated by the three-dimensional data encoding device can easily reproduce the target space by combining the plurality of pieces of divided data using the first identifier and the second identifier. Therefore, the processing amount of the three-dimensional data decoding device can be reduced.

For example, in the encoding described above, the three-dimensional data encoding device encodes the geometry information and the attribute information on the three-dimensional points included in each of the plurality of pieces of divided data. Each of the plurality of pieces of encoded data includes encoded data of the geometry information and the encoded data of the attribute information. Each of the plurality of pieces of control information includes the control information for the encoded data of the geometry information and the control information for the encoded data of the attribute information. The first identifier and the second identifier are stored in the control information for the encoded data of the geometry information.

For example, in the bitstream, each of the plurality of pieces of control information is arranged to precede the encoded data associated with the control information.
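The arrangement of control information ahead of its encoded data can be illustrated with a minimal Python sketch. The header layout used here (two 16-bit identifiers followed by a 32-bit payload length) and the names `Unit`, `HEADER`, `write_bitstream`, and `read_bitstream` are assumptions made for illustration only; the actual header is the one shown in FIG. 30.

```python
import struct
from dataclasses import dataclass

# Hypothetical header layout: slice_idx (16 bits), tile_idx (16 bits),
# payload length in bytes (32 bits), all big-endian.
HEADER = struct.Struct(">HHI")

@dataclass
class Unit:
    slice_idx: int   # first identifier: the subspace
    tile_idx: int    # second identifier: the piece of divided data
    payload: bytes   # encoded data of that piece of divided data

def write_bitstream(units):
    """Emit each piece of control information immediately before its encoded data."""
    out = bytearray()
    for u in units:
        out += HEADER.pack(u.slice_idx, u.tile_idx, len(u.payload))
        out += u.payload
    return bytes(out)

def read_bitstream(data):
    """Recover the (slice_idx, tile_idx, payload) units from the bitstream."""
    units, pos = [], 0
    while pos < len(data):
        slice_idx, tile_idx, length = HEADER.unpack_from(data, pos)
        pos += HEADER.size
        units.append(Unit(slice_idx, tile_idx, data[pos:pos + length]))
        pos += length
    return units
```

Because each header precedes its payload, a reader in this sketch can inspect the identifiers and skip a payload it does not need without decoding it.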

One or more subspaces are set in a target space including a plurality of three-dimensional points, and each subspace includes one or more pieces of divided data each including one or more three-dimensional points. The three-dimensional data encoding device generates a plurality of pieces of encoded data each associated with a different one of a plurality of pieces of divided data by encoding each of the plurality of pieces of divided data, and generates a bitstream including the plurality of pieces of encoded data and a plurality of pieces of control information each associated with a different one of the plurality of pieces of encoded data, and each of the plurality of pieces of control information may store the first identifier that indicates a subspace associated with the piece of encoded data associated with the piece of control information and the second identifier that indicates a piece of divided data associated with the piece of encoded data associated with the piece of control information.

For example, the three-dimensional data encoding device includes a processor and memory, and the processor performs the processes described above using the memory.

The three-dimensional data decoding device according to this embodiment performs the process shown in FIG. 34. First, from a bitstream including a plurality of pieces of encoded data generated by encoding of each of a plurality of pieces of divided data (tiles, for example) that are included in a plurality of subspaces (slices, for example) generated by dividing a target space including a plurality of three-dimensional points and each of which includes one or more three-dimensional points, and a plurality of pieces of control information (the header shown in FIG. 30, for example) for each of the plurality of pieces of encoded data, the three-dimensional data decoding device obtains the first identifier (slice_idx, for example) that indicates a subspace associated with the piece of encoded data associated with the piece of control information and the second identifier (tile_idx, for example) that indicates a piece of divided data associated with the piece of encoded data associated with the piece of control information, which are included in the plurality of pieces of control information (S4941). The three-dimensional data decoding device then reproduces the plurality of pieces of divided data by decoding the plurality of pieces of encoded data (S4942). The three-dimensional data decoding device then reproduces the target space by combining the plurality of pieces of divided data using the first identifier and the second identifier (S4943). For example, the three-dimensional data decoding device reproduces the plurality of subspaces by combining the plurality of pieces of divided data using the second identifier, and reproduces the target space (the plurality of three-dimensional points) by combining the plurality of subspaces using the first identifier. Note that the three-dimensional data decoding device may obtain encoded data of a desired subspace or desired divided data from the bitstream using at least one of the first identifier and the second identifier, and selectively or preferentially decode the obtained encoded data.
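The two-stage combining in S4943 and the selective decoding mentioned above can be sketched as follows. This is a minimal illustration, assuming the divided data has already been decoded into lists of points; the function name `rebuild_target_space` and the `wanted_slices` parameter are hypothetical.

```python
from collections import defaultdict

def rebuild_target_space(decoded, wanted_slices=None):
    """decoded: iterable of (slice_idx, tile_idx, points) for each piece of divided data.
    wanted_slices: optional set of slice_idx values to keep (selective decoding);
    None keeps every subspace."""
    subspaces = defaultdict(dict)
    for slice_idx, tile_idx, points in decoded:
        if wanted_slices is not None and slice_idx not in wanted_slices:
            continue  # skip subspaces that were not requested
        # The second identifier places the divided data inside its subspace.
        subspaces[slice_idx][tile_idx] = points
    target = []
    # The first identifier orders the subspaces within the target space.
    for slice_idx in sorted(subspaces):
        for tile_idx in sorted(subspaces[slice_idx]):
            target.extend(subspaces[slice_idx][tile_idx])
    return target
```

Passing `wanted_slices={1}`, for example, returns only the points of the subspace whose first identifier is 1.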

With such a configuration, the three-dimensional data decoding device can easily reproduce the target space by combining the plurality of pieces of divided data using the first identifier and the second identifier. Therefore, the processing amount of the three-dimensional data decoding device can be reduced.

For example, each of a plurality of pieces of encoded data is generated by encoding geometry information and attribute information on a three-dimensional point included in an associated piece of divided data, and includes encoded data of the geometry information and encoded data of the attribute information. Each of the plurality of pieces of control information includes control information for the encoded data of the geometry information and control information for the encoded data of the attribute information. The first identifier and the second identifier are stored in the control information for the encoded data of the geometry information.

For example, in the bitstream, the control information is arranged to precede the associated encoded data.

For example, the three-dimensional data decoding device includes a processor and memory, and the processor performs the processes described above using the memory.

Embodiment 5

In encoding of geometry information using neighborhood dependency, the coding efficiency can be improved as the density of a point cloud increases. In this embodiment, the three-dimensional data encoding device collectively encodes point cloud data of successive frames by combining the point cloud data of the successive frames. In this process, the three-dimensional data encoding device generates encoded data additionally including information for identifying a frame to which each leaf node included in the combined point cloud data belongs.

Here, point cloud data of successive frames are likely to be similar to each other. That is, occupancy codes for successive frames are likely to have a common higher-level part. In other words, occupancy codes for successive frames can share a higher-level part if the successive frames are collectively encoded.

By encoding a frame index, the frame to which a point belongs is determined at a leaf node.

FIG. 35 is a diagram showing a concept of generation of a tree structure and an occupancy code from point cloud data of N point cloud compression (PCC) frames. In this drawing, a point in a hollow arrow indicates a point that belongs to a PCC frame. First, a frame index for identifying a frame is assigned to a point that belongs to each PCC frame.

Points belonging to the N frames are then converted into a tree structure, and an occupancy code is generated. Specifically, the leaf node in the tree structure to which each point belongs is determined. In the drawing, the tree structure represents a set of nodes. The node to which a point belongs is determined beginning with the highest-level node. The determination result for each node is encoded into an occupancy code. The occupancy code is common among the N frames.

A node can include points belonging to different frames to which different frame indices are assigned. When the octree has a low resolution, a node can include points belonging to the same frame to which the same frame index is assigned.

In a lowest-level node (leaf node), points belonging to a plurality of frames can be mixed (duplicated).

As for the tree structure and the occupancy code, a higher-level part of the tree structure and occupancy codes in the higher-level part can be a common component for all the frames, and a lower-level part of the tree structure and occupancy codes in the lower-level part can be an individual component for each frame or can be partially a common component and partially an individual component.

For example, at a lowest-level node, such as a leaf node, zero or more points having a frame index are generated, and information indicating the number of points and information on the frame index of each point are generated. These pieces of information can be regarded as individual information for frames.
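The split between a shared occupancy code and per-leaf frame information can be illustrated with a minimal Python sketch. It assumes integer point coordinates in a cube of side 2**depth, a breadth-first octree, and a child numbering chosen for the example; the function name `encode_combined` is hypothetical, and entropy coding is omitted.

```python
def encode_combined(frames, depth):
    """frames: list of point lists; each point is (x, y, z) with integer
    coordinates in [0, 2**depth). Returns (occupancy, leaves)."""
    # Assign a frame index to every point before the frames are combined.
    tagged = [(p, f) for f, pts in enumerate(frames) for p in pts]
    occupancy, level = [], [tagged]
    for d in range(depth):
        shift = depth - 1 - d
        next_level = []
        for node in level:
            children = [[] for _ in range(8)]
            for (x, y, z), f in node:
                child = ((((x >> shift) & 1) << 2)
                         | (((y >> shift) & 1) << 1)
                         | ((z >> shift) & 1))
                children[child].append(((x, y, z), f))
            # One 8-bit occupancy code per occupied node, shared by all frames.
            occupancy.append(sum(1 << i for i, c in enumerate(children) if c))
            next_level.extend(c for c in children if c)
        level = next_level
    # Individual information: point count and frame indices for each leaf.
    leaves = [(len(node), sorted(f for _, f in node)) for node in level]
    return occupancy, leaves
```

In this sketch, two frames that both contain the point (0, 0, 0) contribute a single path of occupancy codes, and the leaf records two points with frame indices 0 and 1.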

FIG. 36 is a diagram showing an example of frame combining. As shown in part (a) of FIG. 36, if a tree structure is generated by combining a plurality of frames, the density of the points of the frames included in the same node increases. In addition, if the tree structure is shared, the data amount of the occupancy codes can be reduced. In this way, the coding efficiency can be improved.

As shown in part (b) of FIG. 36, as the individual components of the occupancy codes in the tree structure become denser, the effectiveness of the arithmetic encoding increases, so that the coding efficiency can be improved.

In the following, combining of a plurality of PCC frames associated with different times will be described as an example. However, the description holds true for a case where there is not a plurality of frames, that is, frame combining is not performed (N=1). Furthermore, the plurality of pieces of point cloud data to be combined is not limited to a plurality of frames, that is, a plurality of pieces of point cloud data on the same object associated with different time points. That is, the method described below can be applied to combining of a plurality of pieces of point cloud data associated with different spaces or different times and spaces. The method described below can also be applied to combining of point cloud data or point cloud files of different contents.

FIG. 37 is a diagram showing an example of combining of a plurality of PCC frames associated with different times. FIG. 37 shows an example in which an automobile obtains point cloud data with a sensor such as LiDAR while the automobile is moving. A dotted line indicates an effective range of the sensor in each frame, that is, a range of point cloud data. As the effective range of the sensor increases, the range of the point cloud data also increases.

The method of combining and encoding point cloud data is effective for point cloud data such as that described below. For example, in the example shown in FIG. 37, the automobile is moving, and a frame is identified by 360° scanning of the periphery of the automobile. That is, frame 2, the frame following frame 1, corresponds to another 360° scanning performed when the automobile has moved in an X direction.

In this case, frame 1 and frame 2 partially overlap with each other and therefore can include common point cloud data. Therefore, if frame 1 and frame 2 are combined and encoded, the coding efficiency can be improved. Note that more than two frames may be combined. However, as the number of frames combined increases, the number of bits required for encoding of the frame indices assigned to the leaf nodes increases.
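As a rough illustration of this trade-off, the following sketch computes the per-point cost of a frame index under the simplifying assumption of a fixed-length binary code. The function name `frame_index_bits` is hypothetical, and an actual encoder that entropy-codes the frame indices may spend fewer bits.

```python
def frame_index_bits(n_frames):
    """Bits per point needed to tell n_frames frames apart with a
    fixed-length code, i.e. ceil(log2(n_frames)) for n_frames >= 1."""
    return (n_frames - 1).bit_length()
```

Under this assumption, N=1 needs no frame index bits, N=2 needs one bit per point, and N=5 needs three bits per point.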

Alternatively, point cloud data may be obtained by sensors at different positions. In that case, each piece of point cloud data obtained at a different position can be used as a frame. That is, the plurality of frames may be point cloud data obtained by a single sensor or point cloud data obtained by a plurality of sensors. Furthermore, objects may be partially or totally the same or may be different in the plurality of frames.

Next, a flow of a three-dimensional data encoding process according to this embodiment will be described. FIG. 38 is a flowchart of the three-dimensional data encoding process. According to the combined frame count N, which is the number of frames to be combined, the three-dimensional data encoding device reads point cloud data of all the N frames.

First, the three-dimensional data encoding device determines the combined frame count N (S5401). For example, the combined frame count N is specified by a user.

The three-dimensional data encoding device then obtains point cloud data (S5402). The three-dimensional data encoding device then records frame indices of the obtained point cloud data (S5403).

When the N frames have not been processed (if No in S5404), the three-dimensional data encoding device specifies next point cloud data (S5405), and performs step S5402 and the following processing on the specified point cloud data.

On the other hand, when the N frames have been processed (if Yes in S5404), the three-dimensional data encoding device combines the N frames and encodes the resulting combined frame (S5406).
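The loop of FIG. 38 can be sketched in a few lines of Python. This is an illustration only: the name `combine_frames` and the use of an iterator as the source of frames are assumptions, and the encoding of the combined frame (S5406) is left out.

```python
def combine_frames(frame_source, n):
    """frame_source: iterator yielding one list of points per frame.
    n: combined frame count N (S5401). Returns (point, frame_index) pairs."""
    combined = []
    for frame_idx in range(n):            # S5404/S5405: repeat until N frames are read
        points = next(frame_source)       # S5402: obtain point cloud data
        # S5403: record the frame index alongside every point of this frame
        combined.extend((p, frame_idx) for p in points)
    return combined                       # input to the combined encoding (S5406)
```

Only the first N frames are consumed from the source, so any further frames remain available for the next combined frame.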

FIG. 39 is a flowchart of the encoding process (S5406). First, the three-dimensional data encoding device generates common information that is common to the N frames (S5411). For example, the common information includes an occupancy code and information indicating the combined frame count N.

The three-dimensional data encoding device then generates individual information on each frame (S5412). For example, the individual information includes the number of points included in a leaf node, and the frame indices of the points included in the leaf node.

The three-dimensional data encoding device then combines the common information and the individual information, and generates encoded data by encoding the combined information (S5413). The three-dimensional data encoding device then generates additional information (metadata) concerning the frame combining, and encodes the generated additional information (S5414).

Next, a flow of a three-dimensional data decoding process according to this embodiment will be described. FIG. 40 is a flowchart of the three-dimensional data decoding process.

First, the three-dimensional data decoding device obtains the combined frame count N from a bitstream (S5421). The three-dimensional data decoding device then obtains encoded data from the bitstream (S5422). The three-dimensional data decoding device decodes the encoded data to obtain point cloud data and frame indices (S5423). Finally, the three-dimensional data decoding device divides the decoded point cloud data using the frame indices (S5424).

FIG. 41 is a flowchart of the decoding and dividing process (S5423 and S5424). First, the three-dimensional data decoding device decodes the encoded data (bitstream) into common information and individual information (that is, obtains common information and individual information from the encoded data) (S5431).

The three-dimensional data decoding device then determines whether to decode a single frame or to decode a plurality of frames (S5432). For example, whether to decode a single frame or to decode a plurality of frames may be externally specified. Here, the plurality of frames may be all the frames combined or some of the frames combined. For example, the three-dimensional data decoding device may determine to decode a particular frame required by an application, and not to decode the frames that are not required. Alternatively, when real-time decoding is required, the three-dimensional data decoding device may determine to decode a single frame of the plurality of frames combined.

When decoding a single frame (if Yes in S5432), the three-dimensional data decoding device extracts individual information associated with the frame index of the specified single frame from the decoded individual information, and decodes the extracted individual information to reproduce point cloud data of the specified frame corresponding to the frame index (S5433).

On the other hand, when decoding a plurality of frames (if No in S5432), the three-dimensional data decoding device extracts individual information associated with the frame indices of the specified plurality of frames (or all the frames), and decodes the extracted individual information to reproduce point cloud data of the specified plurality of frames (S5434). The three-dimensional data decoding device then divides the decoded point cloud data (individual information) based on the frame indices (S5435). That is, the three-dimensional data decoding device divides the decoded point cloud data into the plurality of frames.

Note that the three-dimensional data decoding device may collectively decode data of all the frames combined and then divide the decoded data into frames, or collectively decode data of an arbitrary part of the frames combined and divide the decoded data into frames. Furthermore, the three-dimensional data decoding device may separately decode data of a previously determined unit frame composed of a plurality of frames.
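The dividing step based on frame indices (S5433 to S5435) can be illustrated with the following sketch. It assumes that decoding has already produced (point, frame index) pairs; the name `split_by_frame` and the `wanted` parameter are hypothetical.

```python
def split_by_frame(decoded_points, wanted=None):
    """decoded_points: iterable of (point, frame_idx) pairs.
    wanted: set of frame indices to reproduce, or None for all frames.
    Returns a dict mapping frame_idx to the list of points of that frame."""
    frames = {}
    for point, frame_idx in decoded_points:
        if wanted is None or frame_idx in wanted:
            frames.setdefault(frame_idx, []).append(point)
    return frames
```

Passing a one-element set such as `wanted={1}` corresponds to reproducing a single frame, while `wanted=None` divides the decoded data into all the combined frames.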

In the following, a configuration of the three-dimensional data encodingdevice according to this embodiment will be described. FIG. 42 is ablock diagram showing a configuration of encoder 5410 included in thethree-dimensional data encoding device according to this embodiment.Encoder 5410 generates encoded data (encoded stream) by encoding pointcloud data (point cloud). Encoder 5410 includes divider 5411, aplurality of geometry information encoders 5412, a plurality ofattribute information encoders 5413, additional information encoder5414, and multiplexer 5415.

Divider 5411 generates a plurality of pieces of divided data of aplurality of frames by dividing point cloud data of a plurality offrames. Specifically, divider 5411 generates a plurality of pieces ofdivided data by dividing a space of point cloud data of each frame intoa plurality of subspaces. Here, a subspace is a tile, a slice, or acombination of a tile and a slice. More specifically, point cloud dataincludes geometry information, attribute information (color, reflectanceor the like), and additional information. A frame number is also inputto divider 5411. Divider 5411 divides geometry information of each frameinto a plurality of pieces of divided geometry information, and dividesattribute information of each frame into a plurality of pieces ofdivided attribute information. Divider 5411 also generates additionalinformation concerning the division.

For example, divider 5411 divides a point cloud into tiles. Divider 5411 then divides the resulting tiles into slices.
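The two-stage division can be sketched as follows. The concrete rules used here are assumptions made for illustration and are not taken from this description: tiles are cells of a regular grid in the x-y plane, and each tile is cut into slices holding at most a fixed number of points. The function name `divide` and its parameters are hypothetical.

```python
def divide(points, tile_size, max_points_per_slice):
    """points: list of (x, y, z) with non-negative integer coordinates.
    Returns a list of dicts, one per piece of divided data."""
    # Stage 1 (tile division): group points by grid cell in the x-y plane.
    tiles = {}
    for x, y, z in points:
        tiles.setdefault((x // tile_size, y // tile_size), []).append((x, y, z))
    # Stage 2 (slice division): cut every tile into bounded-size slices.
    divided = []
    for tile_idx, key in enumerate(sorted(tiles)):
        pts = tiles[key]
        starts = range(0, len(pts), max_points_per_slice)
        for slice_idx, start in enumerate(starts):
            divided.append({
                "tile_idx": tile_idx,
                "slice_idx": slice_idx,
                "points": pts[start:start + max_points_per_slice],
            })
    return divided
```

Each returned piece carries its own indices, so the pieces could be encoded independently and in parallel.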

The plurality of geometry information encoders 5412 generate a plurality of pieces of encoded geometry information by encoding a plurality of pieces of divided geometry information. For example, geometry information encoder 5412 encodes divided geometry information using an N-ary tree, such as an octree. Specifically, in the case of an octree, a target space is divided into eight nodes (subspaces), and 8-bit information (occupancy code) that indicates whether each node includes a point cloud or not is generated. A node including a point cloud is further divided into eight nodes, and 8-bit information that indicates whether each of the eight nodes includes a point cloud or not is generated. This process is repeated until a predetermined level is reached or the number of points included in a node becomes equal to or less than a threshold. For example, the plurality of geometry information encoders 5412 process the plurality of pieces of divided geometry information in parallel.

Attribute information encoder 4632 generates encoded attributeinformation, which is encoded data, by encoding attribute informationusing configuration information generated by geometry informationencoder 4631. For example, attribute information encoder 4632 determinesa reference point (reference node) that is to be referred to in encodinga target point (target node) to be processed based on the octreestructure generated by geometry information encoder 4631. For example,attribute information encoder 4632 refers to a node whose parent node inthe octree is the same as the parent node of the target node, ofperipheral nodes or neighboring nodes. Note that the method ofdetermining a reference relationship is not limited to this method.

The process of encoding geometry information or attribute informationmay include at least one of a quantization process, a predictionprocess, and an arithmetic encoding process. In this case, “refer to”means using a reference node for calculating a predicted value ofattribute information or using a state of a reference node (occupancyinformation that indicates whether a reference node includes a pointcloud or not, for example) for determining a parameter of encoding. Forexample, the parameter of encoding is a quantization parameter in thequantization process or a context or the like in the arithmeticencoding.

Attribute information encoders 5413 generate pieces of encoded attribute information by encoding pieces of divided attribute information. For example, attribute information encoders 5413 process pieces of divided attribute information in parallel.

Additional information encoder 5414 generates encoded additionalinformation by encoding additional information included in point clouddata and additional information regarding data division generated at thetime of dividing by divider 5411.

Multiplexer 5415 generates encoded data (encoded stream) by multiplexingpieces of encoded geometry information, pieces of encoded attributeinformation, and encoded additional information, and transmits thegenerated encoded data. The encoded additional information is also usedat the time of decoding.

FIG. 43 is a block diagram showing divider 5411. Divider 5411 includes tile divider 5421 and slice divider 5422.

Tile divider 5421 generates a plurality of pieces of tile geometryinformation by dividing geometry information (position (geometry)) ofeach of a plurality of frames into tiles. Tile divider 5421 alsogenerates a plurality of pieces of tile attribute information bydividing attribute information (attribute) of a plurality of frames intotiles. Tile divider 5421 outputs tile additional information (tilemetadata) including information concerning the tile division andinformation generated in the tile division.

Slice divider 5422 generates a plurality of pieces of divided geometryinformation (a plurality of pieces of slice geometry information) bydividing a plurality of pieces of tile geometry information into slices.Slice divider 5422 also generates a plurality of pieces of dividedattribute information (a plurality of pieces of slice attributeinformation) by dividing a plurality of pieces of tile attributeinformation into slices. Slice divider 5422 outputs slice additionalinformation (slice metadata) including information concerning the slicedivision and information generated in the slice division.

In the dividing process, divider 5411 uses a frame number (frame index) to indicate coordinates of an origin, attribute information or the like.

FIG. 44 is a block diagram showing geometry information encoder 5412. Geometry information encoder 5412 includes frame index generator 5431 and entropy encoder 5432.

Frame index generator 5431 determines a value of a frame index based on a frame number, and adds the determined frame index to geometry information. Entropy encoder 5432 generates encoded geometry information by entropy-encoding divided geometry information with a frame index added thereto.

FIG. 45 is a block diagram showing attribute information encoder 5413. Attribute information encoder 5413 includes frame index generator 5441 and entropy encoder 5442.

Frame index generator 5441 determines a value of a frame index based on a frame number, and adds the determined frame index to attribute information. Entropy encoder 5442 generates encoded attribute information by entropy-encoding divided attribute information with a frame index added thereto.

The following describes procedures of a point cloud data encoding process and a point cloud data decoding process according to the present embodiment. FIG. 46 is a flowchart of a point cloud data encoding process according to the present embodiment.

First, the three-dimensional data encoding device determines a division method to be used (S5441). Examples of the division method include tile division and slice division. A division method may include a division number, a division type, etc. when tile division or slice division is performed.

When tile division is performed (YES in S5442), the three-dimensional data encoding device generates pieces of tile geometry information and pieces of tile attribute information by dividing geometry information and attribute information collectively (S5443). In addition, the three-dimensional data encoding device generates tile additional information regarding the tile division.

When slice division is performed (YES in S5444), the three-dimensional data encoding device generates pieces of divided geometry information and pieces of divided attribute information by dividing the pieces of tile geometry information and the pieces of tile attribute information (or the geometry information and the attribute information) separately (S5445). Also, the three-dimensional data encoding device generates geometry slice additional information and attribute slice additional information regarding the slice division.

Next, the three-dimensional data encoding device generates pieces of encoded geometry information and pieces of encoded attribute information by respectively encoding the pieces of divided geometry information and the pieces of divided attribute information together with frame indexes (S5446). In addition, the three-dimensional data encoding device generates dependency relationship information.

Finally, the three-dimensional data encoding device generates encoded data (an encoded stream) by storing in NAL units (multiplexing) the pieces of encoded geometry information, the pieces of encoded attribute information, and additional information (S5447). Additionally, the three-dimensional data encoding device transmits the generated encoded data.

FIG. 47 is a flowchart of the encoding process (S5446). First, the three-dimensional data encoding device encodes divided geometry information (S5451). The three-dimensional data encoding device then encodes a frame index for the divided geometry information (S5452).

When there is divided attribute information (if Yes in S5453), the three-dimensional data encoding device encodes the divided attribute information (S5454), and encodes a frame index for the divided attribute information (S5455). On the other hand, when there is no divided attribute information (if No in S5453), the three-dimensional data encoding device performs neither encoding of divided attribute information nor encoding of a frame index for divided attribute information. Note that the frame index may be stored in either or both of the divided geometry information and the divided attribute information.

Note that the three-dimensional data encoding device may encode attribute information using a frame index or without using a frame index. That is, the three-dimensional data encoding device may identify a frame to which each point belongs using a frame index and perform encoding on a frame basis, or may encode the points belonging to all the frames without identifying the frames.

In the following, a configuration of the three-dimensional data decodingdevice according to this embodiment will be described. FIG. 48 is ablock diagram showing a configuration of decoder 5450. Decoder 5450reproduces point cloud data by decoding encoded data (encoded stream)generated by encoding the point cloud data. Decoder 5450 includesdemultiplexer 5451, a plurality of geometry information decoders 5452, aplurality of attribute information decoders 5453, additional informationdecoder 5454, and combiner 5455.

Demultiplexer 5451 generates a plurality of pieces of encoded geometry information, a plurality of pieces of encoded attribute information, and encoded additional information by demultiplexing encoded data (encoded stream).

The plurality of geometry information decoders 5452 generate a plurality of pieces of divided geometry information by decoding a plurality of pieces of encoded geometry information. For example, the plurality of geometry information decoders 5452 process a plurality of pieces of encoded geometry information in parallel.

The plurality of attribute information decoders 5453 generate a plurality of pieces of divided attribute information by decoding a plurality of pieces of encoded attribute information. For example, the plurality of attribute information decoders 5453 process a plurality of pieces of encoded attribute information in parallel.

Additional information decoder 5454 generates additional information by decoding encoded additional information.

Combiner 5455 generates geometry information by combining a plurality ofpieces of divided geometry information using additional information.Combiner 5455 generates attribute information by combining a pluralityof pieces of divided attribute information using additional information.Combiner 5455 also divides geometry information and attributeinformation into geometry information of a plurality of frames andattribute information of a plurality of frames using frame indices.

FIG. 49 is a block diagram showing geometry information decoder 5452. Geometry information decoder 5452 includes entropy decoder 5461 and frame index obtainer 5462. Entropy decoder 5461 generates divided geometry information by entropy-decoding encoded geometry information. Frame index obtainer 5462 obtains a frame index from divided geometry information.

FIG. 50 is a block diagram showing attribute information decoder 5453. Attribute information decoder 5453 includes entropy decoder 5471 and frame index obtainer 5472. Entropy decoder 5471 generates divided attribute information by entropy-decoding encoded attribute information. Frame index obtainer 5472 obtains a frame index from divided attribute information.

FIG. 51 is a diagram showing a configuration of combiner 5455. Combiner5455 generates geometry information by combining a plurality of piecesof divided geometry information. Combiner 5455 generates attributeinformation by combining a plurality of pieces of divided attributeinformation. Combiner 5455 also divides geometry information andattribute information into geometry information of a plurality of framesand attribute information of a plurality of frames using frame indices.

FIG. 52 is a flowchart of a point cloud data decoding process according to the present embodiment. First, the three-dimensional data decoding device determines a division method by analyzing additional information (slice additional information and tile additional information) regarding a division method included in encoded data (an encoded stream) (S5461). Examples of the division method include tile division and slice division. A division method may include a division number, a division type, etc. when tile division or slice division is performed.

Next, the three-dimensional data decoding device generates divided geometry information and divided attribute information by decoding pieces of encoded geometry information and pieces of encoded attribute information included in the encoded data, using dependency relationship information included in the encoded data (S5462).

When the additional information indicates that slice division has been performed (YES in S5463), the three-dimensional data decoding device generates pieces of tile geometry information and pieces of tile attribute information by combining pieces of divided geometry information and combining pieces of divided attribute information, based on the slice additional information (S5464). Here, the pieces of divided geometry information, the pieces of divided attribute information, the pieces of tile geometry information, and the pieces of tile attribute information include frame indices.

When the additional information indicates that tile division has been performed (YES in S5465), the three-dimensional data decoding device generates geometry information and attribute information by combining the pieces of tile geometry information (the pieces of divided geometry information) and combining the pieces of tile attribute information (the pieces of divided attribute information), based on tile additional information (S5466). Here, the pieces of tile geometry information, the pieces of tile attribute information, the geometry information, and the attribute information include frame indices.

FIG. 53 is a flowchart of the decoding process (S5464 and S5466). First, the three-dimensional data decoding device decodes divided geometry information (slice geometry information) (S5471). The three-dimensional data decoding device then decodes a frame index for the divided geometry information (S5472).

When there is divided attribute information (YES in S5473), the three-dimensional data decoding device decodes the divided attribute information (S5474), and decodes a frame index for the divided attribute information (S5475). On the other hand, when there is no divided attribute information (NO in S5473), the three-dimensional data decoding device decodes neither divided attribute information nor a frame index for divided attribute information.

Note that the three-dimensional data decoding device may decode attribute information using a frame index or without using a frame index.

In the following, a unit of encoding in frame combining will be described. FIG. 54 is a diagram showing an example of a pattern of frame combining. This drawing shows an example in which PCC frames form a time series, and data is generated and encoded in real time.

Part (a) of FIG. 54 shows a case where four frames are always combined. The three-dimensional data encoding device waits until data of four frames is generated, and then generates encoded data.

Part (b) of FIG. 54 shows a case where the number of frames to be combined adaptively varies. For example, the three-dimensional data encoding device changes the number of frames to be combined in order to adjust the code amount of encoded data in a rate control.

Note that, when frame combining would provide no benefit, the three-dimensional data encoding device need not combine frames. The three-dimensional data encoding device may also determine whether or not to combine frames.

Part (c) of FIG. 54 shows an example of a case where a plurality of combined frames partially overlaps with a plurality of frames to be combined next. This example is useful when real-time processing or low delay is required, such as when each piece of data is transmitted as soon as the data is encoded.
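As a non-limiting illustration, the three patterns of FIG. 54 can be viewed as different ways of grouping a time-ordered sequence of frame numbers. The following Python sketch is one possible reading; the function names and the `stride` parameter are hypothetical and do not appear in the present disclosure, and the group sizes passed for pattern (b) are assumed to come from some external rate control.

```python
from typing import List, Sequence


def fixed_groups(frames: Sequence[int], size: int) -> List[List[int]]:
    """Pattern (a): always combine `size` consecutive frames."""
    return [list(frames[i:i + size]) for i in range(0, len(frames), size)]


def adaptive_groups(frames: Sequence[int], sizes: Sequence[int]) -> List[List[int]]:
    """Pattern (b): the number of combined frames varies from group to group."""
    groups: List[List[int]] = []
    start = 0
    for size in sizes:
        groups.append(list(frames[start:start + size]))
        start += size
    return groups


def overlapping_groups(frames: Sequence[int], size: int, stride: int) -> List[List[int]]:
    """Pattern (c): consecutive groups share frames whenever stride < size."""
    last_start = len(frames) - size
    return [list(frames[i:i + size]) for i in range(0, last_start + 1, stride)]
```

With eight frames numbered 1 to 8, `fixed_groups(frames, 4)` yields two groups of four frames, while `overlapping_groups(frames, 4, 2)` yields three groups in which each group shares two frames with the next.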

FIG. 55 is a diagram showing a configuration example of PCC frames. The three-dimensional data encoding device may configure frames to be combined in such a manner that the frames include at least a data unit that can be singly decoded. For example, when all the PCC frames are intra-encoded, and the PCC frames can be singly decoded as shown in part (a) of FIG. 55, any of the patterns described above can be applied.

When a random access unit such as a group of frames (GOF) is set, for example when inter-prediction is applied as shown in part (b) of FIG. 55, the three-dimensional data encoding device may combine data using the GOF as a minimum unit.

Note that the three-dimensional data encoding device may collectively encode common information and individual information or separately encode common information and individual information. Furthermore, the three-dimensional data encoding device may use a common data structure or different data structures for common information and individual information.

The three-dimensional data encoding device may compare occupancy codes for a plurality of frames after an occupancy code is generated for each frame. For example, the three-dimensional data encoding device may determine whether there is a large common part between occupancy codes for a plurality of frames based on a predetermined criterion, and generate common information if there is a large common part. Alternatively, based on whether there is a large common part between occupancy codes, the three-dimensional data encoding device may determine whether to combine frames, which frames are to be combined, or the number of frames to be combined.
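The present disclosure leaves the predetermined criterion open. Purely as a sketch under stated assumptions, the comparison could be performed on two sequences of per-node occupancy codes, treating the fraction of positions with identical codes as the size of the common part. The function names, the position-by-position comparison, and the default threshold of 0.5 are all hypothetical simplifications.

```python
from typing import Sequence


def common_ratio(codes_a: Sequence[int], codes_b: Sequence[int]) -> float:
    """Fraction of node positions whose occupancy codes are identical.

    Assumption: both sequences list occupancy codes in the same traversal
    order, so equal positions refer to comparable nodes.
    """
    if not codes_a or not codes_b:
        return 0.0
    matches = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return matches / max(len(codes_a), len(codes_b))


def should_combine(codes_a: Sequence[int], codes_b: Sequence[int],
                   threshold: float = 0.5) -> bool:
    """Hypothetical criterion: combine when the common part is large enough."""
    return common_ratio(codes_a, codes_b) >= threshold
```

An actual encoder would likely compare occupancy along matching tree paths rather than by position in a flat list; the flat comparison is used here only to keep the example short.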

Next, a configuration of encoded geometry information will be described. FIG. 56 is a diagram showing a configuration of encoded geometry information. Encoded geometry information includes a header and a payload.

FIG. 57 is a diagram showing a syntax example of a header (Geometry_header) of encoded geometry information. The header of encoded geometry information includes a GPS index (gps_idx), offset information (offset), other information (other_geometry_information), a frame combining flag (combine_frame_flag), and a combined frame count (number_of_combine_frame).

The GPS index indicates an identifier (ID) of a parameter set (GPS) associated with encoded geometry information. GPS is a parameter set of encoded geometry information of one frame or a plurality of frames. Note that, when there is a parameter set for each frame, the header may indicate identifiers of a plurality of parameter sets.

The offset information indicates an offset position for obtaining combined data. The other information indicates other information concerning geometry information (a difference value of a quantization parameter (QPdelta), for example). The frame combining flag indicates whether frame combining has been performed for encoded data or not. The combined frame count indicates the number of frames combined.
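As a non-normative illustration, the header fields listed for FIG. 57 can be modeled as a simple record. The Python sketch below assumes field types (integers, a Boolean, and a dictionary for the other information) that the present disclosure does not specify. The class name `GeometryHeader` and the helper `frames_in_payload` are hypothetical, and the rule that an uncombined payload carries exactly one frame is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class GeometryHeader:
    gps_idx: int                                 # ID of the associated GPS
    offset: int                                  # offset position for obtaining combined data
    other_geometry_information: Dict[str, int]   # e.g. {"QPdelta": -2}
    combine_frame_flag: bool                     # whether frame combining was performed
    number_of_combine_frame: int = 1             # number of frames combined

    def frames_in_payload(self) -> int:
        """Assumed rule: without frame combining the payload holds one frame."""
        return self.number_of_combine_frame if self.combine_frame_flag else 1
```

A decoder built along these lines could call `frames_in_payload()` to decide how many per-frame point sets to expect after decoding.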

Note that part or all of the information described above may be described in SPS or GPS. Note that SPS means a parameter set based on a sequence (a plurality of frames) as a unit, and is a parameter set commonly used for encoded geometry information and encoded attribute information.

FIG. 58 is a diagram showing a syntax example of a payload (Geometry_data) of encoded geometry information. The payload of encoded geometry information includes common information and leaf node information.

Common information is data of one or more frames combined, and includes an occupancy code (occupancy_Code) or the like.

Leaf node information (combine_information) is information on each leaf node. Leaf node information may be indicated for each frame as a loop of the number of frames.

As a method of indicating a frame index of a point included in a leaf node, either method 1 or method 2 can be used. FIG. 59 is a diagram showing an example of the leaf node information in the case of method 1. The leaf node information shown in FIG. 59 includes the three-dimensional point count (NumberOfPoints) that indicates the number of points included in a node, and a frame index (FrameIndex) for each point.
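As a minimal sketch of method 1, the leaf node information can be written as a point count followed by one frame index per point. The flat list-of-integers layout and the function names below are assumptions made for illustration; the actual syntax of FIG. 59 and its entropy coding are not reproduced here.

```python
from typing import List, Sequence


def write_leaf_method1(frame_indices: Sequence[int]) -> List[int]:
    """Emit NumberOfPoints followed by a FrameIndex for each point."""
    return [len(frame_indices), *frame_indices]


def read_leaf_method1(values: Sequence[int]) -> List[int]:
    """Read NumberOfPoints, then that many FrameIndex values."""
    number_of_points = values[0]
    frame_indices = list(values[1:1 + number_of_points])
    if len(frame_indices) != number_of_points:
        raise ValueError("leaf node information is truncated")
    return frame_indices
```

Because one frame index is written per point, this layout can describe a leaf node that holds several points from the same frame.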

FIG. 60 is a diagram showing an example of the leaf node information in the case of method 2. In the example shown in FIG. 60, the leaf node information includes bit map information (bitmapIsFramePointsFlag) that indicates frame indices of a plurality of points with a bit map. FIG. 61 is a diagram showing an example of the bit map information. In this example, the bit map indicates that the leaf node includes three-dimensional points of frame indices 1, 3, and 5.
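As a sketch of method 2, a bit map can carry one bit per combined frame, with a set bit meaning that the leaf node contains a point of that frame. The bit order is an assumption of this sketch (bit k, counted from the least significant bit, stands for frame index k); the order actually shown in FIG. 61 may differ. The function names are hypothetical.

```python
from typing import Iterable, List


def frames_to_bitmap(frame_indices: Iterable[int], num_frames: int) -> int:
    """Set bit k for every frame index k present in the leaf node."""
    bitmap = 0
    for index in frame_indices:
        if not 0 <= index < num_frames:
            raise ValueError(f"frame index {index} out of range")
        bitmap |= 1 << index
    return bitmap


def bitmap_to_frames(bitmap: int, num_frames: int) -> List[int]:
    """Recover the frame indices whose bits are set."""
    return [k for k in range(num_frames) if (bitmap >> k) & 1]
```

Under this bit order, frame indices 1, 3, and 5 out of eight combined frames give the bit map 0b00101010. A single bit per frame cannot count several points of the same frame in one leaf node, which is a limitation of this sketch.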

Note that, when the quantization resolution is low, there may be duplicated points in the same frame. In that case, the three-dimensional point count (NumberOfPoints) may be shared, and the number of three-dimensional points in each frame and the total number of three-dimensional points in a plurality of frames may be indicated.

When lossy compression is used, the three-dimensional data encoding device may delete a duplicated point to reduce the information amount. The three-dimensional data encoding device may delete a duplicated point before frame combining or after frame combining.
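The following Python sketch illustrates how duplicated points arise at a low quantization resolution and how they might be deleted within a single frame before frame combining. The uniform quantization by `step`, the use of floor rounding, and the function name are assumptions of this sketch, not details taken from the present disclosure.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
QPoint = Tuple[int, int, int]


def dedupe_quantized(points: Sequence[Point], step: float) -> List[QPoint]:
    """Quantize each point to a grid of size `step` and keep the first
    occurrence of every quantized position, preserving input order."""
    seen = set()
    kept: List[QPoint] = []
    for point in points:
        q = tuple(math.floor(c / step) for c in point)
        if q not in seen:
            seen.add(q)
            kept.append(q)
    return kept
```

With a step of 1.0, the points (0.1, 0.2, 0.3) and (0.4, 0.1, 0.2) fall into the same cell and only one of them is kept. Applying the same function to the combined points would instead merge points that coincide across frames, so where the deletion is performed changes the result.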

Next, a configuration of encoded attribute information will be described. FIG. 62 is a diagram showing a configuration of encoded attribute information. The encoded attribute information includes a header and a payload.

FIG. 63 is a diagram showing a syntax example of a header (Attribute_header) of encoded attribute information. The header of the encoded attribute information includes an APS index (aps_idx), offset information (offset), other information (other_attribute_information), a frame combining flag (combine_frame_flag), and a combined frame count (number_of_combine_frame).

The APS index indicates an identifier (ID) of a parameter set (APS) associated with encoded attribute information. APS is a parameter set of encoded attribute information of one frame or a plurality of frames. Note that, when there is a parameter set for each frame, the header may indicate identifiers of a plurality of parameter sets.

The offset information indicates an offset position for obtaining combined data. The other information indicates other information concerning attribute information (a difference value of a quantization parameter (QPdelta), for example). The frame combining flag indicates whether frame combining has been performed for encoded data or not. The combined frame count indicates the number of frames combined.

Note that all or part of the information described above may be described in SPS or APS.

FIG. 64 is a diagram showing a syntax example of a payload (Attribute_data) of encoded attribute information. The payload of encoded attribute information includes leaf node information (combine_information). For example, a configuration of the leaf node information is the same as that of the leaf node information included in the payload of the encoded geometry information. That is, the leaf node information (frame index) may be included in the attribute information.

The leaf node information may be stored in one of the encoded geometry information and the encoded attribute information and not included in the other. In that case, the leaf node information (frame index) stored in one of the encoded geometry information and the encoded attribute information is referred to when decoding the other information. Furthermore, information indicating a reference destination may be included in the encoded geometry information or the encoded attribute information.

Next, an example of the order of transmission of encoded data and an example of the order of decoding of encoded data will be described. FIG. 65 is a diagram showing a configuration of encoded data. The encoded data includes a header and a payload.

FIGS. 66 to 68 are diagrams showing an order of data transmission and a data reference relationship. In these drawings, G(1) or the like denotes encoded geometry information, GPS(1) or the like denotes a parameter set for encoded geometry information, and SPS denotes a parameter set for a sequence (a plurality of frames). A numeral in parentheses indicates a value of a frame index. Note that the three-dimensional data encoding device may transmit data in an order of decoding.

FIG. 66 is a diagram showing an example of the order of transmission in a case where frame combining is not performed. FIG. 67 is a diagram showing an example of a case where frame combining is performed and metadata (a parameter set) is added to each PCC frame. FIG. 68 is a diagram showing an example of a case where frame combining is performed and metadata (a parameter set) is added on a basis of frames combined.

The header of the data of the combined frames stores an identifier of the metadata to be referenced, so that the metadata of the frames can be obtained. As shown in FIG. 68, metadata of a plurality of frames can be brought together. Any parameters common to the plurality of frames combined can be brought together as one parameter. Parameters that are not common to the frames indicate a value for each frame.

Information on each frame (a parameter that is not common to frames) is a timestamp that indicates a time point of generation of frame data, a time point of encoding of frame data, or a time point of decoding of frame data, for example. Information on each frame may include information from a sensor that has obtained the frame data (such as sensor speed, sensor acceleration, sensor position information, sensor orientation, or other sensor information).

FIG. 69 is a diagram showing an example in which some of the frames are decoded in the example shown in FIG. 67. As shown in FIG. 69, if there is no dependency between frames in the data of the frames combined, the three-dimensional data decoding device can separately decode each piece of data.

When point cloud data has attribute information, the three-dimensional data encoding device can combine attribute information of frames. Attribute information is encoded and decoded by referring to geometry information. The geometry information referred to may be geometry information before frame combining or geometry information after frame combining. The combined frame count for geometry information and the combined frame count for attribute information may be common (the same) or independent (different).

FIGS. 70 to 73 are diagrams showing an order of data transmission and a data reference relationship. FIGS. 70 and 71 show an example in which geometry information of four frames and attribute information of four frames are combined. In FIG. 70, metadata (a parameter set) is added to each PCC frame. In FIG. 71, metadata (a parameter set) is added on a basis of frames combined. In these drawings, A(1) or the like denotes encoded attribute information, and APS(1) or the like denotes a parameter set for encoded attribute information. A numeral in parentheses indicates a value of a frame index.

FIG. 72 shows an example in which geometry information of four frames is combined, and attribute information is not combined. As shown in FIG. 72, geometry information of frames may be combined while attribute information of frames is not combined.

FIG. 73 shows an example in which frame combining and tile division are combined. When tile division is performed as shown in FIG. 73, the header of each piece of tile geometry information includes information such as a GPS index (gps_idx) and a combined frame count (number_of_combine_frame). The header of each piece of tile geometry information also includes a tile index (tile_idx) for identifying a tile.

As described above, the three-dimensional data encoding device according to this embodiment performs the process shown in FIG. 74. First, the three-dimensional data encoding device combines first point cloud data and second point cloud data to generate third point cloud data (S5481). The three-dimensional data encoding device then encodes the third point cloud data to generate encoded data (S5482). The encoded data includes identification information (a frame index, for example) that indicates whether each of the plurality of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data.
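The combining step (S5481) can be sketched as follows, with each point of the combined point cloud carrying the index of the frame it came from. The representation of a point as a coordinate tuple, the 0-based frame indices, and the function name are assumptions of this sketch; the subsequent encoding (S5482) is not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]


def combine_frames(frames: Sequence[Sequence[Point]]) -> List[Tuple[Point, int]]:
    """Merge several point clouds into one list of (position, frame_index).

    The frame index plays the role of the identification information:
    it records which input point cloud each combined point belongs to.
    """
    combined: List[Tuple[Point, int]] = []
    for frame_index, points in enumerate(frames):
        for point in points:
            combined.append((point, frame_index))
    return combined
```

Two points at the same position in different frames remain distinguishable in the combined data because their frame indices differ.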

With such a configuration, the three-dimensional data encoding device collectively encodes a plurality of pieces of point cloud data, so that the coding efficiency can be improved.

For example, the first point cloud data and the second point cloud data are point cloud data (PCC frames, for example) associated with different time points. For example, the first point cloud data and the second point cloud data are point cloud data (PCC frames, for example) on the same object associated with different time points.

The encoded data includes geometry information and attribute information on each of the plurality of three-dimensional points included in the third point cloud data, and the identification information is included in the attribute information.

For example, the encoded data includes geometry information (an occupancy code, for example) that represents the position of each of the plurality of three-dimensional points included in the third point cloud data using an N-ary tree (N represents an integer equal to or greater than 2).
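For N = 8 (an octree), an occupancy code for one node can be sketched as an 8-bit value in which each bit marks whether the corresponding child cube contains at least one point. The child numbering used below (x selects bit 2 of the child number, y bit 1, z bit 0) and the function name are assumptions of this sketch, since the bit order is not fixed by the text above.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def occupancy_code(points: Iterable[Point], origin: Point, size: float) -> int:
    """Return the 8-bit occupancy code of a cubic node.

    `origin` is the minimum corner of the node and `size` its edge length.
    Points are assumed to lie inside the node.
    """
    half = size / 2
    code = 0
    for x, y, z in points:
        child = (
            (int(x >= origin[0] + half) << 2)
            | (int(y >= origin[1] + half) << 1)
            | int(z >= origin[2] + half)
        )
        code |= 1 << child
    return code
```

When frames are combined, points from every frame contribute to the same code, which is why a per-point frame index is needed to tell them apart again.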

For example, the three-dimensional data encoding device includes a processor and memory, and the processor performs the process described above using the memory.

The three-dimensional data decoding device according to this embodiment performs the process shown in FIG. 75. First, the three-dimensional data decoding device decodes encoded data to obtain third point cloud data generated by combining first point cloud data and second point cloud data, and identification information that indicates whether each of a plurality of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data (S5491). The three-dimensional data decoding device then separates the third point cloud data into the first point cloud data and the second point cloud data using the identification information (S5492).
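The separation step (S5492) can be sketched as grouping the decoded points by their frame index. The parallel-list representation of positions and frame indices and the function name are assumptions of this sketch; the decoding step (S5491) that produces these lists is not shown.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[int, int, int]


def separate_frames(points: Sequence[Point],
                    frame_indices: Sequence[int]) -> Dict[int, List[Point]]:
    """Split combined point cloud data into one point list per frame index."""
    if len(points) != len(frame_indices):
        raise ValueError("each point needs exactly one frame index")
    frames: Dict[int, List[Point]] = defaultdict(list)
    for point, frame_index in zip(points, frame_indices):
        frames[frame_index].append(point)
    return dict(frames)
```

With two combined frames, the entry for frame index 0 corresponds to the first point cloud data and the entry for frame index 1 to the second point cloud data.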

With such a configuration, the three-dimensional data decoding device can decode encoded data whose coding efficiency has been improved by collectively encoding a plurality of pieces of point cloud data.

For example, the first point cloud data and the second point cloud data are point cloud data (PCC frames, for example) associated with different time points. For example, the first point cloud data and the second point cloud data are point cloud data (PCC frames, for example) on the same object associated with different time points.

The encoded data includes geometry information and attribute information on each of the plurality of three-dimensional points included in the third point cloud data, and the identification information is included in the attribute information.

For example, the encoded data includes geometry information (an occupancy code, for example) that represents the position of each of the plurality of three-dimensional points included in the third point cloud data using an N-ary tree (N represents an integer equal to or greater than 2).

For example, the three-dimensional data decoding device includes a processor and memory, and the processor performs the process described above using the memory.

A three-dimensional data encoding device, a three-dimensional data decoding device, and the like according to the embodiments of the present disclosure have been described above, but the present disclosure is not limited to these embodiments.

Note that each of the processors included in the three-dimensional data encoding device, the three-dimensional data decoding device, and the like according to the above embodiments is typically implemented as a large-scale integrated (LSI) circuit, which is an integrated circuit (IC). These may take the form of individual chips, or may be partially or entirely packaged into a single chip.

Such an IC is not limited to an LSI, and thus may be implemented as a dedicated circuit or a general-purpose processor. Alternatively, a field programmable gate array (FPGA) that allows for programming after the manufacture of an LSI, or a reconfigurable processor that allows for reconfiguration of the connection and the setting of circuit cells inside an LSI may be employed.

Moreover, in the above embodiments, the structural components may be implemented as dedicated hardware or may be realized by executing a software program suited to such structural components. Alternatively, the structural components may be implemented by a program executor such as a CPU or a processor reading out and executing the software program recorded in a recording medium such as a hard disk or a semiconductor memory.

The present disclosure may also be implemented as a three-dimensional data encoding method, a three-dimensional data decoding method, or the like executed by the three-dimensional data encoding device, the three-dimensional data decoding device, and the like.

Also, the divisions of the functional blocks shown in the block diagrams are mere examples, and thus a plurality of functional blocks may be implemented as a single functional block, or a single functional block may be divided into a plurality of functional blocks, or one or more functions may be moved to another functional block. Also, the functions of a plurality of functional blocks having similar functions may be processed by single hardware or software in a parallelized or time-divided manner.

Also, the processing order of executing the steps shown in the flowcharts is a mere illustration for specifically describing the present disclosure, and thus may be an order other than the shown order. Also, one or more of the steps may be executed simultaneously (in parallel) with another step.

A three-dimensional data encoding device, a three-dimensional data decoding device, and the like according to one or more aspects have been described above based on the embodiments, but the present disclosure is not limited to these embodiments. The one or more aspects may thus include forms achieved by making various modifications to the above embodiments that can be conceived by those skilled in the art, as well as forms achieved by combining structural components in different embodiments, without materially departing from the spirit of the present disclosure.

INDUSTRIAL APPLICABILITY

The present disclosure is applicable to a three-dimensional data encoding device and a three-dimensional data decoding device.

What is claimed is:
 1. A three-dimensional data encoding method, comprising: combining first point cloud data and second point cloud data to generate third point cloud data; and encoding the third point cloud data to generate encoded data, wherein the encoded data includes identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data.
 2. The three-dimensional data encoding method according to claim 1, wherein the first point cloud data and the second point cloud data are point cloud data having different times.
 3. The three-dimensional data encoding method according to claim 2, wherein the first point cloud data and the second point cloud data are point cloud data of a same object and having different times.
 4. The three-dimensional data encoding method according to claim 1, wherein the encoded data includes geometry information and attribute information of each of the three-dimensional points included in the third point cloud data, and the identification information is included in the attribute information.
 5. The three-dimensional data encoding method according to claim 1, wherein the encoded data includes geometry information in which a position of each of the three-dimensional points included in the third point cloud data is expressed using an N-ary tree, N being an integer greater than or equal to 2.
 6. A three-dimensional data decoding method, comprising: decoding encoded data to obtain third point cloud data and identification information, the third point cloud data being generated by combining first point cloud data and second point cloud data, the identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data; and separating the first point cloud data and the second point cloud data from the third point cloud data, using the identification information.
 7. The three-dimensional data decoding method according to claim 6, wherein the first point cloud data and the second point cloud data are point cloud data having different times.
 8. The three-dimensional data decoding method according to claim 7, wherein the first point cloud data and the second point cloud data are point cloud data of a same subject and having different times.
 9. The three-dimensional data decoding method according to claim 6, wherein the encoded data includes geometry information and attribute information of each of the three-dimensional points included in the third point cloud data, and the identification information is included in the attribute information.
 10. The three-dimensional data decoding method according to claim 6, wherein the encoded data includes geometry information in which a position of each of the three-dimensional points included in the third point cloud data is expressed using an N-ary tree, N being an integer greater than or equal to 2.
 11. A three-dimensional data encoding device, comprising: a processor; and memory, wherein, using the memory, the processor: combines first point cloud data and second point cloud data to generate third point cloud data; and encodes the third point cloud data to generate encoded data, wherein the encoded data includes identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data.
 12. A three-dimensional data decoding device, comprising: a processor; and memory, wherein, using the memory, the processor: decodes encoded data to obtain third point cloud data and identification information, the third point cloud data being generated by combining first point cloud data and second point cloud data, the identification information indicating whether each of three-dimensional points included in the third point cloud data belongs to the first point cloud data or the second point cloud data; and separates the first point cloud data and the second point cloud data from the third point cloud data, using the identification information.